Dataset statistics
| Number of variables | 21 |
|---|---|
| Number of observations | 45379 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 20 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 8.5 MiB |
| Average record size in memory | 195.3 B |
Variable types
| Categorical | 9 |
|---|---|
| Numeric | 8 |
| Unsupported | 3 |
| DateTime | 1 |
| Dataset has 20 (< 0.1%) duplicate rows | Duplicates |
| belongs_to_collection has a high cardinality: 1699 distinct values | High cardinality |
| genres has a high cardinality: 4068 distinct values | High cardinality |
| original_language has a high cardinality: 93 distinct values | High cardinality |
| overview has a high cardinality: 44234 distinct values | High cardinality |
| spoken_languages has a high cardinality: 1842 distinct values | High cardinality |
| tagline has a high cardinality: 20270 distinct values | High cardinality |
| title has a high cardinality: 42197 distinct values | High cardinality |
| name_collection has a high cardinality: 1696 distinct values | High cardinality |
| budget is highly overall correlated with revenue | High correlation |
| revenue is highly overall correlated with budget and 1 other fields | High correlation |
| vote_count is highly overall correlated with revenue | High correlation |
| belongs_to_collection is highly imbalanced (86.1%) | Imbalance |
| original_language is highly imbalanced (67.7%) | Imbalance |
| spoken_languages is highly imbalanced (61.4%) | Imbalance |
| status is highly imbalanced (96.6%) | Imbalance |
| name_collection is highly imbalanced (86.1%) | Imbalance |
| return is highly skewed (γ1 = 138.3340992) | Skewed |
| title is uniformly distributed | Uniform |
| popularity is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| production_companies is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| production_countries is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| budget has 36493 (80.4%) zeros | Zeros |
| revenue has 37972 (83.7%) zeros | Zeros |
| runtime has 1784 (3.9%) zeros | Zeros |
| vote_average has 2950 (6.5%) zeros | Zeros |
| vote_count has 2852 (6.3%) zeros | Zeros |
| return has 40005 (88.2%) zeros | Zeros |
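The zero shares in these alerts follow from the zero counts and the 45379 observations reported in the dataset statistics. A minimal Python sketch that recomputes them; the dictionary and the helper name `zero_share` are illustrative and not part of the report:

```python
# Zero counts copied from the alerts; N_ROWS is the reported number of observations.
N_ROWS = 45379
ZERO_COUNTS = {
    "budget": 36493,
    "revenue": 37972,
    "runtime": 1784,
    "vote_average": 2950,
    "vote_count": 2852,
    "return": 40005,
}


def zero_share(count: int, total: int = N_ROWS) -> float:
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: zero_share(count) for name, count in ZERO_COUNTS.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```

With more than 80% zeros in `budget`, `revenue` and `return`, the zeros likely stand for missing values rather than true amounts, which is worth confirming before relying on the correlation alerts.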
Reproduction
| Analysis started | 2023-07-11 17:35:06.206067 |
|---|---|
| Analysis finished | 2023-07-11 17:35:37.357543 |
| Duration | 31.15 seconds |
| Software version | pandas-profiling v3.6.6 |
| Download configuration | config.json |
belongs_to_collection
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 1699 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| {'name':'sin datos' } | 40888 |
|---|---|
| {'id': 415931, 'name': 'The Bowery Boys', 'poster_path': '/q6sA4bzMT9cK7EEmXYwt7PNrL5h.jpg', 'backdrop_path': '/foe3kuiJmg5AklhtD3skWbaTMf2.jpg'} | 29 |
| {'id': 421566, 'name': 'Totò Collection', 'poster_path': '/4ayJsjC3djGwU9eCWUokdBWvdLC.jpg', 'backdrop_path': '/jaUuprubvAxXLAY5hUfrNjxccUh.jpg'} | 27 |
| {'id': 645, 'name': 'James Bond Collection', 'poster_path': '/HORpg5CSkmeQlAolx3bKMrKgfi.jpg', 'backdrop_path': '/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg'} | 26 |
| {'id': 96887, 'name': 'Zatôichi: The Blind Swordsman', 'poster_path': '/8Q31DAtmFJjhFTwQGXghBUCgWK2.jpg', 'backdrop_path': '/bY8gLImMR5Pr9PaG3ZpobfaAQ8N.jpg'} | 26 |
| Other values (1694) |
Length
| Max length | 184 |
|---|---|
| Median length | 21 |
| Mean length | 32.915313 |
| Min length | 8 |
Characters and Unicode
| Total characters | 1493664 |
|---|---|
| Distinct characters | 170 |
| Distinct categories | 13 |
| Distinct scripts | 7 |
| Distinct blocks | 8 |
Unique
| Unique | 393 |
|---|---|
| Unique (%) | 0.9% |
Sample
| 1st row | {'id': 10194, 'name': 'Toy Story Collection', 'poster_path': '/7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg', 'backdrop_path': '/9FBwqcd9IRruEDUrTdcaafOMKUq.jpg'} |
|---|---|
| 2nd row | {'name':'sin datos' } |
| 3rd row | {'id': 119050, 'name': 'Grumpy Old Men Collection', 'poster_path': '/nLvUdqgPgm3F85NMCii9gVFUcet.jpg', 'backdrop_path': '/hypTnLot2z8wpFS7qwsQHW1uV8u.jpg'} |
| 4th row | {'name':'sin datos' } |
| 5th row | {'id': 96871, 'name': 'Father of the Bride Collection', 'poster_path': '/nts4iOmNnq7GNicycMJ9pSAn204.jpg', 'backdrop_path': '/7qwE57OVZmMJChBpLEbJEmzUydk.jpg'} |
Common Values
| Value | Count | Frequency (%) |
| {'name':'sin datos' } | 40888 | |
| {'id': 415931, 'name': 'The Bowery Boys', 'poster_path': '/q6sA4bzMT9cK7EEmXYwt7PNrL5h.jpg', 'backdrop_path': '/foe3kuiJmg5AklhtD3skWbaTMf2.jpg'} | 29 | 0.1% |
| {'id': 421566, 'name': 'Totò Collection', 'poster_path': '/4ayJsjC3djGwU9eCWUokdBWvdLC.jpg', 'backdrop_path': '/jaUuprubvAxXLAY5hUfrNjxccUh.jpg'} | 27 | 0.1% |
| {'id': 645, 'name': 'James Bond Collection', 'poster_path': '/HORpg5CSkmeQlAolx3bKMrKgfi.jpg', 'backdrop_path': '/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg'} | 26 | 0.1% |
| {'id': 96887, 'name': 'Zatôichi: The Blind Swordsman', 'poster_path': '/8Q31DAtmFJjhFTwQGXghBUCgWK2.jpg', 'backdrop_path': '/bY8gLImMR5Pr9PaG3ZpobfaAQ8N.jpg'} | 26 | 0.1% |
| {'id': 37261, 'name': 'The Carry On Collection', 'poster_path': '/2P0HNrYgKDvirV8RCdT1rBSJdbJ.jpg', 'backdrop_path': '/38tF1LJN7ULeZAuAfP7beaPMfcl.jpg'} | 25 | 0.1% |
| {'id': 34055, 'name': 'Pokémon Collection', 'poster_path': '/j5te0YNZAMXDBnsqTUDKIBEt8iu.jpg', 'backdrop_path': '/iGoYKA0TFfgSoZpG2u5viTJMGfK.jpg'} | 22 | < 0.1% |
| {'id': 413661, 'name': 'Charlie Chan (Sidney Toler) Collection', 'poster_path': '/y0xWQpLRattvypZXF5ZiuipsD2U.jpg', 'backdrop_path': None} | 21 | < 0.1% |
| {'id': 374509, 'name': 'Godzilla (Showa) Collection', 'poster_path': '/scvwS6k8gIW8w24UcmePQqVL10l.jpg', 'backdrop_path': '/dx9YSup5zEOjxYwG4UkYBVAZIXo.jpg'} | 16 | < 0.1% |
| {'id': 425164, 'name': 'Dragon Ball Z (Movie) Collection', 'poster_path': '/2VMZ1zRFPnUQtQp5K4WRXvDYBjh.jpg', 'backdrop_path': '/7PcbijxTfwi9vjWEfXdS0ReAw8q.jpg'} | 15 | < 0.1% |
| Other values (1689) | 4284 | 9.4% |
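The values in this column are Python dict literals stored as strings, which explains the high cardinality and the dominance of the `{'name':'sin datos' }` placeholder. One possible way to reduce each value to its collection name is sketched below with the standard library; the helper name `collection_name` is hypothetical, and the inputs are copied from the table above:

```python
import ast


def collection_name(raw: str) -> str:
    """Parse a stringified dict and return its 'name' field."""
    # literal_eval only accepts Python literals (dicts, strings, numbers, None),
    # so it does not execute arbitrary code the way eval would.
    record = ast.literal_eval(raw)
    return record["name"]


samples = [
    "{'id': 645, 'name': 'James Bond Collection', 'poster_path': '/HORpg5CSkmeQlAolx3bKMrKgfi.jpg', 'backdrop_path': '/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg'}",
    "{'id': 413661, 'name': 'Charlie Chan (Sidney Toler) Collection', 'poster_path': '/y0xWQpLRattvypZXF5ZiuipsD2U.jpg', 'backdrop_path': None}",
    "{'name':'sin datos' }",
]

names = [collection_name(s) for s in samples]
print(names)
```

Rows that fail to parse would raise `ValueError` or `SyntaxError`; whether to skip or flag them is a decision the sketch leaves open.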
Words
| Value | Count | Frequency (%) |
| 41027 | ||
| name':'sin | 40888 | |
| datos | 40888 | |
| name | 4494 | 2.7% |
| id | 4488 | 2.7% |
| poster_path | 4488 | 2.7% |
| backdrop_path | 4488 | 2.7% |
| collection | 3743 | 2.2% |
| none | 1771 | 1.0% |
| the | 1146 | 0.7% |
| Other values (6636) | 21446 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 222735 | |
| (space) | 123489 | 8.3% |
| a | 107467 | 7.2% |
| n | 98493 | 6.6% |
| s | 91920 | 6.2% |
| o | 65914 | 4.4% |
| e | 65100 | 4.4% |
| t | 64074 | 4.3% |
| : | 58939 | 3.9% |
| i | 56212 | 3.8% |
| Other values (160) | 539321 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 807531 | |
| Other Punctuation | 310135 | 20.8% |
| Space Separator | 123489 | 8.3% |
| Uppercase Letter | 94971 | 6.4% |
| Decimal Number | 56927 | 3.8% |
| Close Punctuation | 45711 | 3.1% |
| Open Punctuation | 45711 | 3.1% |
| Connector Punctuation | 8976 | 0.6% |
| Dash Punctuation | 162 | < 0.1% |
| Other Letter | 37 | < 0.1% |
| Other values (3) | 14 | < 0.1% |
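The category names in this table correspond to Unicode general categories. As a rough sketch of how such counts can be reproduced, the standard `unicodedata` module returns two-letter codes (`Ll` for Lowercase Letter, `Po` for Other Punctuation, `Zs` for Space Separator, `Nd` for Decimal Number, `Ps` and `Pe` for Open and Close Punctuation). The helper `category_counts` is illustrative, and the profiling tool may compute its figures differently:

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Ll', 'Po')."""
    return Counter(unicodedata.category(ch) for ch in text)


# A short fragment in the same dict-literal style as the column values.
counts = category_counts("{'id': 645}")
print(counts.most_common())
```

The many quotes, colons and braces in the dict-literal syntax are why Other Punctuation ranks second in this column.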
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 107467 | |
| n | 98493 | |
| s | 91920 | |
| o | 65914 | |
| e | 65100 | |
| t | 64074 | |
| i | 56212 | |
| d | 54579 | |
| m | 49913 | 6.2% |
| p | 29059 | 3.6% |
| Other values (69) | 124800 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 7690 | 8.1% |
| N | 5091 | 5.4% |
| T | 4593 | 4.8% |
| S | 4187 | 4.4% |
| A | 3721 | 3.9% |
| M | 3693 | 3.9% |
| B | 3679 | 3.9% |
| D | 3678 | 3.9% |
| L | 3478 | 3.7% |
| G | 3457 | 3.6% |
| Other values (33) | 51704 |
Other Letter
| Value | Count | Frequency (%) |
| は | 3 | 8.1% |
| つ | 3 | 8.1% |
| ら | 3 | 8.1% |
| い | 3 | 8.1% |
| よ | 3 | 8.1% |
| シ | 3 | 8.1% |
| 男 | 3 | 8.1% |
| リ | 3 | 8.1% |
| ズ | 3 | 8.1% |
| 즈 | 2 | 5.4% |
| Other values (4) | 8 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 222735 | |
| : | 58939 | 19.0% |
| , | 13543 | 4.4% |
| . | 7380 | 2.4% |
| / | 7226 | 2.3% |
| " | 214 | 0.1% |
| & | 52 | < 0.1% |
| ! | 35 | < 0.1% |
| * | 4 | < 0.1% |
| ? | 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 6788 | |
| 2 | 6102 | |
| 3 | 5869 | |
| 4 | 5779 | |
| 5 | 5701 | |
| 9 | 5478 | |
| 8 | 5451 | |
| 6 | 5368 | |
| 7 | 5345 | |
| 0 | 5046 |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 45376 | |
| ) | 330 | 0.7% |
| ] | 5 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| { | 45376 | |
| ( | 330 | 0.7% |
| [ | 5 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 160 | |
| – | 2 | 1.2% |
Space Separator
| Value | Count | Frequency (%) |
| 123489 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 8976 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 3 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 902088 | |
| Common | 591125 | |
| Cyrillic | 414 | < 0.1% |
| Hiragana | 15 | < 0.1% |
| Hangul | 10 | < 0.1% |
| Katakana | 9 | < 0.1% |
| Han | 3 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 107467 | |
| n | 98493 | |
| s | 91920 | |
| o | 65914 | 7.3% |
| e | 65100 | 7.2% |
| t | 64074 | 7.1% |
| i | 56212 | 6.2% |
| d | 54579 | 6.1% |
| m | 49913 | 5.5% |
| p | 29059 | 3.2% |
| Other values (70) | 219357 |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| ц | 16 | 3.9% |
| К | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 |
Common
| Value | Count | Frequency (%) |
| ' | 222735 | |
| (space) | 123489 | |
| : | 58939 | 10.0% |
| } | 45376 | 7.7% |
| { | 45376 | 7.7% |
| , | 13543 | 2.3% |
| _ | 8976 | 1.5% |
| . | 7380 | 1.2% |
| / | 7226 | 1.2% |
| 1 | 6788 | 1.1% |
| Other values (24) | 51297 | 8.7% |
Hiragana
| Value | Count | Frequency (%) |
| は | 3 | |
| つ | 3 | |
| ら | 3 | |
| い | 3 | |
| よ | 3 |
Hangul
| Value | Count | Frequency (%) |
| 즈 | 2 | |
| 시 | 2 | |
| 리 | 2 | |
| 식 | 2 | |
| 객 | 2 |
Katakana
| Value | Count | Frequency (%) |
| シ | 3 | |
| リ | 3 | |
| ズ | 3 |
Han
| Value | Count | Frequency (%) |
| 男 | 3 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1492950 | |
| Cyrillic | 414 | < 0.1% |
| None | 246 | < 0.1% |
| Hiragana | 15 | < 0.1% |
| Punctuation | 14 | < 0.1% |
| Katakana | 12 | < 0.1% |
| Hangul | 10 | < 0.1% |
| CJK | 3 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 222735 | |
| (space) | 123489 | 8.3% |
| a | 107467 | 7.2% |
| n | 98493 | 6.6% |
| s | 91920 | 6.2% |
| o | 65914 | 4.4% |
| e | 65100 | 4.4% |
| t | 64074 | 4.3% |
| : | 58939 | 3.9% |
| i | 56212 | 3.8% |
| Other values (71) | 538607 |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| ц | 16 | 3.9% |
| К | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 |
None
| Value | Count | Frequency (%) |
| é | 45 | |
| ä | 40 | |
| ô | 35 | |
| ò | 28 | |
| ö | 19 | |
| ó | 14 | 5.7% |
| ı | 14 | 5.7% |
| í | 9 | 3.7% |
| á | 4 | 1.6% |
| İ | 4 | 1.6% |
| Other values (19) | 34 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 | |
| … | 3 | 21.4% |
| – | 2 | 14.3% |
Hiragana
| Value | Count | Frequency (%) |
| は | 3 | |
| つ | 3 | |
| ら | 3 | |
| い | 3 | |
| よ | 3 |
Katakana
| Value | Count | Frequency (%) |
| シ | 3 | |
| リ | 3 | |
| ー | 3 | |
| ズ | 3 |
CJK
| Value | Count | Frequency (%) |
| 男 | 3 |
Hangul
| Value | Count | Frequency (%) |
| 즈 | 2 | |
| 시 | 2 | |
| 리 | 2 | |
| 식 | 2 | |
| 객 | 2 |
budget
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 1223 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4232324.6 |
| Minimum | 0 |
|---|---|
| Maximum | 3.8 × 10^8 |
| Zeros | 36493 |
| Zeros (%) | 80.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 25000000 |
| Maximum | 3.8 × 10^8 |
| Range | 3.8 × 10^8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 17439317 |
|---|---|
| Coefficient of variation (CV) | 4.1205056 |
| Kurtosis | 66.63901 |
| Mean | 4232324.6 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.1185794 |
| Sum | 1.9205866 × 10^11 |
| Variance | 3.0412978 × 10^14 |
| Monotonicity | Not monotonic |
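Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A small consistency check, reading the Sum and Variance entries as 1.9205866e11 and 3.0412978e14 (the variable names are illustrative):

```python
import math

# Values copied from the report for the budget column.
N_ROWS = 45379
MEAN = 4232324.6
STD = 17439317

cv = STD / MEAN          # coefficient of variation
variance = STD ** 2      # variance is the squared standard deviation
total = MEAN * N_ROWS    # sum of all budgets

# The report rounds its figures, so compare with a relative tolerance.
print(math.isclose(cv, 4.1205056, rel_tol=1e-6))
print(math.isclose(variance, 3.0412978e14, rel_tol=1e-6))
print(math.isclose(total, 1.9205866e11, rel_tol=1e-6))
```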
| Value | Count | Frequency (%) |
| 0 | 36493 | |
| 5000000 | 286 | 0.6% |
| 10000000 | 259 | 0.6% |
| 20000000 | 243 | 0.5% |
| 2000000 | 242 | 0.5% |
| 15000000 | 226 | 0.5% |
| 3000000 | 223 | 0.5% |
| 25000000 | 206 | 0.5% |
| 1000000 | 197 | 0.4% |
| 30000000 | 190 | 0.4% |
| Other values (1213) | 6814 | 15.0% |
| Value | Count | Frequency (%) |
| 0 | 36493 | |
| 1 | 25 | 0.1% |
| 2 | 14 | < 0.1% |
| 3 | 9 | < 0.1% |
| 4 | 8 | < 0.1% |
| 5 | 8 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 4 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 380000000 | 1 | < 0.1% |
| 300000000 | 1 | < 0.1% |
| 280000000 | 1 | < 0.1% |
| 270000000 | 1 | < 0.1% |
| 260000000 | 3 | < 0.1% |
| 258000000 | 1 | < 0.1% |
| 255000000 | 1 | < 0.1% |
| 250000000 | 10 | |
| 245000000 | 2 | < 0.1% |
| 237000000 | 1 | < 0.1% |
genres
Categorical
| Distinct | 4068 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| Drama | 4998 |
|---|---|
| Comedy | 3621 |
| Documentary | 2713 |
| sin datos | 2384 |
| Drama, Romance | 1301 |
| Other values (4063) |
Length
| Max length | 84 |
|---|---|
| Median length | 68 |
| Mean length | 16.07477 |
| Min length | 3 |
Characters and Unicode
| Total characters | 729457 |
|---|---|
| Distinct characters | 40 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2367 |
|---|---|
| Unique (%) | 5.2% |
Sample
| 1st row | Animation, Comedy, Family |
|---|---|
| 2nd row | Adventure, Fantasy, Family |
| 3rd row | Romance, Comedy |
| 4th row | Comedy, Drama, Romance |
| 5th row | Comedy |
Common Values
| Value | Count | Frequency (%) |
| Drama | 4998 | 11.0% |
| Comedy | 3621 | 8.0% |
| Documentary | 2713 | 6.0% |
| sin datos | 2384 | 5.3% |
| Drama, Romance | 1301 | 2.9% |
| Comedy, Drama | 1135 | 2.5% |
| Horror | 974 | 2.1% |
| Comedy, Romance | 930 | 2.0% |
| Comedy, Drama, Romance | 593 | 1.3% |
| Drama, Comedy | 532 | 1.2% |
| Other values (4058) | 26198 |
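Each value is a comma-separated list of genres, so the 4068 distinct values are mostly different combinations and orderings (for example `Comedy, Drama` and `Drama, Comedy` are counted separately). A minimal sketch of counting individual genres instead, using the five sample rows above; the helper name `genre_counts` is hypothetical:

```python
from collections import Counter


def genre_counts(rows):
    """Split each comma-separated value and count the individual genres."""
    counts = Counter()
    for row in rows:
        counts.update(part.strip() for part in row.split(","))
    return counts


sample_rows = [
    "Animation, Comedy, Family",
    "Adventure, Fantasy, Family",
    "Romance, Comedy",
    "Comedy, Drama, Romance",
    "Comedy",
]

counts = genre_counts(sample_rows)
print(counts.most_common(3))
```

Note that a genre name containing a comma would be split incorrectly; none appears in the rows shown here.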
Words
| Value | Count | Frequency (%) |
| drama | 20255 | |
| comedy | 13181 | |
| thriller | 7619 | 7.6% |
| romance | 6733 | 6.8% |
| action | 6592 | 6.6% |
| horror | 4670 | 4.7% |
| crime | 4305 | 4.3% |
| documentary | 3921 | 3.9% |
| adventure | 3494 | 3.5% |
| science | 3042 | 3.1% |
| Other values (38) | 25827 |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 69082 | 9.5% |
| a | 64206 | 8.8% |
| e | 55786 | 7.6% |
| (space) | 54260 | 7.4% |
| m | 53101 | 7.3% |
| o | 50925 | 7.0% |
| , | 48053 | 6.6% |
| i | 42054 | 5.8% |
| n | 38060 | 5.2% |
| t | 28594 | 3.9% |
| Other values (30) | 225336 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 531500 | |
| Uppercase Letter | 95644 | 13.1% |
| Space Separator | 54260 | 7.4% |
| Other Punctuation | 48053 | 6.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 69082 | |
| a | 64206 | |
| e | 55786 | |
| m | 53101 | |
| o | 50925 | |
| i | 42054 | |
| n | 38060 | |
| t | 28594 | 5.4% |
| y | 28510 | 5.4% |
| c | 27977 | 5.3% |
| Other values (12) | 73205 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 24176 | |
| C | 17489 | |
| A | 12020 | |
| F | 9746 | |
| T | 8389 | 8.8% |
| R | 6735 | 7.0% |
| H | 6068 | 6.3% |
| M | 4830 | 5.0% |
| S | 3046 | 3.2% |
| W | 2365 | 2.5% |
| Other values (6) | 780 | 0.8% |
Space Separator
| Value | Count | Frequency (%) |
| 54260 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 48053 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 627144 | |
| Common | 102313 | 14.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 69082 | |
| a | 64206 | 10.2% |
| e | 55786 | 8.9% |
| m | 53101 | 8.5% |
| o | 50925 | 8.1% |
| i | 42054 | 6.7% |
| n | 38060 | 6.1% |
| t | 28594 | 4.6% |
| y | 28510 | 4.5% |
| c | 27977 | 4.5% |
| Other values (28) | 168849 |
Common
| Value | Count | Frequency (%) |
| (space) | 54260 | |
| , | 48053 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 729457 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 69082 | 9.5% |
| a | 64206 | 8.8% |
| e | 55786 | 7.6% |
| (space) | 54260 | 7.4% |
| m | 53101 | 7.3% |
| o | 50925 | 7.0% |
| , | 48053 | 6.6% |
| i | 42054 | 5.8% |
| n | 38060 | 5.2% |
| t | 28594 | 3.9% |
| Other values (30) | 225336 |
id
Real number (ℝ)
| Distinct | 45347 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 108019.96 |
| Minimum | 1 |
|---|---|
| Maximum | 469172 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5333.9 |
| Q1 | 26380.5 |
| median | 59853 |
| Q3 | 156472 |
| 95-th percentile | 357170.8 |
| Maximum | 469172 |
| Range | 469171 |
| Interquartile range (IQR) | 130091.5 |
Descriptive statistics
| Standard deviation | 112168.11 |
|---|---|
| Coefficient of variation (CV) | 1.0384017 |
| Kurtosis | 0.55969805 |
| Mean | 108019.96 |
| Median Absolute Deviation (MAD) | 44421 |
| Skewness | 1.2831253 |
| Sum | 4.9018379 × 10^9 |
| Variance | 1.2581685 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 141971 | 3 | < 0.1% |
| 265189 | 2 | < 0.1% |
| 119916 | 2 | < 0.1% |
| 152795 | 2 | < 0.1% |
| 84198 | 2 | < 0.1% |
| 159849 | 2 | < 0.1% |
| 42495 | 2 | < 0.1% |
| 110428 | 2 | < 0.1% |
| 97995 | 2 | < 0.1% |
| 132641 | 2 | < 0.1% |
| Other values (45337) | 45358 |
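The column has 45347 distinct values across 45379 rows, so 32 rows repeat an id that has already appeared. A short sketch of how repeated ids could be listed; the helper `repeated_ids` is illustrative, and the input mixes two ids from the table above with made-up ones (862 and 8844):

```python
from collections import Counter


def repeated_ids(ids):
    """Return a dict mapping each id that occurs more than once to its count."""
    counts = Counter(ids)
    return {value: n for value, n in counts.items() if n > 1}


# 141971 and 265189 come from the table; 862 and 8844 are hypothetical fillers.
ids = [141971, 862, 141971, 265189, 141971, 265189, 8844]

repeats = repeated_ids(ids)
print(repeats)
```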
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 11 | 1 | |
| 12 | 2 | |
| 13 | 1 | |
| 14 | 1 | |
| 15 | 1 |
| Value | Count | Frequency (%) |
| 469172 | 1 | |
| 468707 | 1 | |
| 468343 | 1 | |
| 467731 | 1 | |
| 465044 | 1 | |
| 464819 | 1 | |
| 464207 | 1 | |
| 464111 | 1 | |
| 463906 | 1 | |
| 463800 | 1 |
original_language
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 93 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| en | 32202 |
|---|---|
| fr | 2437 |
| it | 1528 |
| ja | 1349 |
| de | 1078 |
| Other values (88) |
Length
| Max length | 9 |
|---|---|
| Median length | 2 |
| Mean length | 2.0018511 |
| Min length | 2 |
Characters and Unicode
| Total characters | 90842 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 20 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
Common Values
| Value | Count | Frequency (%) |
| en | 32202 | |
| fr | 2437 | 5.4% |
| it | 1528 | 3.4% |
| ja | 1349 | 3.0% |
| de | 1078 | 2.4% |
| es | 992 | 2.2% |
| ru | 822 | 1.8% |
| hi | 508 | 1.1% |
| ko | 444 | 1.0% |
| zh | 408 | 0.9% |
| Other values (83) | 3611 | 8.0% |
Words
| Value | Count | Frequency (%) |
| en | 32202 | |
| fr | 2437 | 5.4% |
| it | 1528 | 3.4% |
| ja | 1349 | 3.0% |
| de | 1078 | 2.4% |
| es | 992 | 2.2% |
| ru | 822 | 1.8% |
| hi | 508 | 1.1% |
| ko | 444 | 1.0% |
| zh | 408 | 0.9% |
| Other values (84) | 3622 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 34527 | |
| n | 32921 | |
| r | 3630 | 4.0% |
| f | 2835 | 3.1% |
| i | 2399 | 2.6% |
| t | 2261 | 2.5% |
| a | 1850 | 2.0% |
| s | 1674 | 1.8% |
| j | 1350 | 1.5% |
| d | 1334 | 1.5% |
| Other values (24) | 6061 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 90818 | |
| Space Separator | 11 | < 0.1% |
| Decimal Number | 10 | < 0.1% |
| Other Punctuation | 3 | < 0.1% |
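Spaces, digits and periods should not occur in two-letter language codes, and the maximum length of 9 against a median of 2 points the same way: a handful of rows hold something other than a code. A possible check is sketched below. The pattern assumes valid codes are exactly two lowercase ASCII letters, and the sample list is hypothetical (`sin datos` is the placeholder seen in other columns of this report, `104.0` is a made-up numeric string):

```python
import re

# Assumption: a valid code is exactly two lowercase ASCII letters.
CODE_PATTERN = re.compile(r"[a-z]{2}")


def malformed_codes(values):
    """Return the values that do not look like a two-letter language code."""
    return [v for v in values if not CODE_PATTERN.fullmatch(v)]


# Hypothetical sample mixing valid codes with two suspicious entries.
sample = ["en", "fr", "sin datos", "ja", "104.0"]

bad = malformed_codes(sample)
print(bad)
```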
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 34527 | |
| n | 32921 | |
| r | 3630 | 4.0% |
| f | 2835 | 3.1% |
| i | 2399 | 2.6% |
| t | 2261 | 2.5% |
| a | 1850 | 2.0% |
| s | 1674 | 1.8% |
| j | 1350 | 1.5% |
| d | 1334 | 1.5% |
| Other values (16) | 6037 | 6.6% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4 | |
| 8 | 2 | |
| 2 | 1 | 10.0% |
| 6 | 1 | 10.0% |
| 1 | 1 | 10.0% |
| 4 | 1 | 10.0% |
Space Separator
| Value | Count | Frequency (%) |
| 11 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 90818 | |
| Common | 24 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 34527 | |
| n | 32921 | |
| r | 3630 | 4.0% |
| f | 2835 | 3.1% |
| i | 2399 | 2.6% |
| t | 2261 | 2.5% |
| a | 1850 | 2.0% |
| s | 1674 | 1.8% |
| j | 1350 | 1.5% |
| d | 1334 | 1.5% |
| Other values (16) | 6037 | 6.6% |
Common
| Value | Count | Frequency (%) |
| 11 | ||
| 0 | 4 | 16.7% |
| . | 3 | 12.5% |
| 8 | 2 | 8.3% |
| 2 | 1 | 4.2% |
| 6 | 1 | 4.2% |
| 1 | 1 | 4.2% |
| 4 | 1 | 4.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 90842 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 34527 | |
| n | 32921 | |
| r | 3630 | 4.0% |
| f | 2835 | 3.1% |
| i | 2399 | 2.6% |
| t | 2261 | 2.5% |
| a | 1850 | 2.0% |
| s | 1674 | 1.8% |
| j | 1350 | 1.5% |
| d | 1334 | 1.5% |
| Other values (24) | 6061 | 6.7% |
overview
Categorical
| Distinct | 44234 |
|---|---|
| Distinct (%) | 97.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| sin datos | 941 |
|---|---|
| No overview found. | 133 |
| No Overview | 7 |
| 5 | |
| Released | 3 |
| Other values (44229) |
Length
| Max length | 1000 |
|---|---|
| Median length | 790 |
| Mean length | 316.75881 |
| Min length | 1 |
Characters and Unicode
| Total characters | 14374198 |
|---|---|
| Distinct characters | 429 |
| Distinct categories | 25 |
| Distinct scripts | 13 |
| Distinct blocks | 21 |
Unique
| Unique | 44173 |
|---|---|
| Unique (%) | 97.3% |
Sample
| 1st row | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. |
|---|---|
| 2nd row | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. |
| 3rd row | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. |
| 4th row | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. |
| 5th row | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. |
Common Values
| Value | Count | Frequency (%) |
| sin datos | 941 | 2.1% |
| No overview found. | 133 | 0.3% |
| No Overview | 7 | < 0.1% |
| 5 | < 0.1% | |
| Released | 3 | < 0.1% |
| Recovering from a nail gun shot to the head and 13 months of coma, doctor Pekka Valinta starts to unravel the mystery of his past, still suffering from total amnesia. | 3 | < 0.1% |
| King Lear, old and tired, divides his kingdom among his daughters, giving great importance to their protestations of love for him. When Cordelia, youngest and most honest, refuses to idly flatter the old man in return for favor, he banishes her and turns for support to his remaining daughters. But Goneril and Regan have no love for him and instead plot to take all his power from him. In a parallel, Lear's loyal courtier Gloucester favors his illegitimate son Edmund after being told lies about his faithful son Edgar. Madness and tragedy befall both ill-starred fathers. | 3 | < 0.1% |
| No movie overview available. | 3 | < 0.1% |
| Adaptation of the Jane Austen novel. | 3 | < 0.1% |
| A few funny little novels about different aspects of life. | 3 | < 0.1% |
| Other values (44224) | 44275 |
Words
| Value | Count | Frequency (%) |
| the | 138082 | 5.6% |
| a | 98889 | 4.0% |
| and | 75259 | 3.1% |
| to | 73321 | 3.0% |
| of | 69574 | 2.8% |
| in | 48143 | 2.0% |
| is | 36500 | 1.5% |
| his | 36165 | 1.5% |
| with | 23902 | 1.0% |
| her | 21484 | 0.9% |
| Other values (97092) | 1829274 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2407291 | |
| e | 1363796 | 9.5% |
| a | 941446 | 6.5% |
| t | 935707 | 6.5% |
| i | 852455 | 5.9% |
| o | 830814 | 5.8% |
| n | 823542 | 5.7% |
| s | 769736 | 5.4% |
| r | 744274 | 5.2% |
| h | 600810 | 4.2% |
| Other values (419) | 4104327 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11157610 | |
| Space Separator | 2407329 | 16.7% |
| Uppercase Letter | 390965 | 2.7% |
| Other Punctuation | 312824 | 2.2% |
| Decimal Number | 42223 | 0.3% |
| Dash Punctuation | 36767 | 0.3% |
| Close Punctuation | 10100 | 0.1% |
| Open Punctuation | 10077 | 0.1% |
| Final Punctuation | 4556 | < 0.1% |
| Initial Punctuation | 882 | < 0.1% |
| Other values (15) | 865 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1363796 | |
| a | 941446 | 8.4% |
| t | 935707 | 8.4% |
| i | 852455 | 7.6% |
| o | 830814 | 7.4% |
| n | 823542 | 7.4% |
| s | 769736 | 6.9% |
| r | 744274 | 6.7% |
| h | 600810 | 5.4% |
| l | 478816 | 4.3% |
| Other values (142) | 2816214 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 42751 | 10.9% |
| T | 35968 | 9.2% |
| S | 31126 | 8.0% |
| M | 23954 | 6.1% |
| B | 23699 | 6.1% |
| C | 22803 | 5.8% |
| H | 19429 | 5.0% |
| W | 18652 | 4.8% |
| I | 16798 | 4.3% |
| D | 16311 | 4.2% |
| Other values (77) | 139474 |
Other Letter
| Value | Count | Frequency (%) |
| न | 6 | 4.8% |
| र | 6 | 4.8% |
| म | 5 | 4.0% |
| の | 4 | 3.2% |
| द | 3 | 2.4% |
| प | 3 | 2.4% |
| ద | 3 | 2.4% |
| अ | 3 | 2.4% |
| व | 2 | 1.6% |
| م | 2 | 1.6% |
| Other values (76) | 88 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 133443 | |
| . | 124794 | |
| ' | 31121 | 9.9% |
| " | 11661 | 3.7% |
| : | 3299 | 1.1% |
| ? | 2759 | 0.9% |
| ; | 2493 | 0.8% |
| ! | 1543 | 0.5% |
| / | 765 | 0.2% |
| & | 453 | 0.1% |
| Other values (12) | 493 | 0.2% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ి | 4 | |
| ் | 3 | |
| ్ | 3 | |
| ् | 3 | |
| ̈ | 3 | |
| ా | 2 | 6.1% |
| े | 2 | 6.1% |
| ं | 2 | 6.1% |
| ु | 2 | 6.1% |
| Other values (4) | 5 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 9748 | |
| 0 | 8265 | |
| 9 | 6405 | |
| 2 | 4251 | |
| 5 | 2440 | 5.8% |
| 8 | 2379 | 5.6% |
| 3 | 2342 | 5.5% |
| 4 | 2176 | 5.2% |
| 7 | 2131 | 5.0% |
| 6 | 2086 | 4.9% |
Spacing Mark
| Value | Count | Frequency (%) |
| ा | 11 | |
| ी | 4 | 14.8% |
| ो | 3 | 11.1% |
| ు | 3 | 11.1% |
| ि | 2 | 7.4% |
| ு | 2 | 7.4% |
| ం | 1 | 3.7% |
| ி | 1 | 3.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 35244 | |
| – | 881 | 2.4% |
| — | 633 | 1.7% |
| ― | 5 | < 0.1% |
| ‐ | 4 | < 0.1% |
Other Symbol
| Value | Count | Frequency (%) |
| ® | 45 | |
| ™ | 14 | 21.9% |
| ¦ | 2 | 3.1% |
| ° | 2 | 3.1% |
| � | 1 | 1.6% |
Math Symbol
| Value | Count | Frequency (%) |
| ~ | 20 | |
| + | 11 | |
| = | 6 | 15.0% |
| | | 2 | 5.0% |
| − | 1 | 2.5% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 10024 | |
| [ | 50 | 0.5% |
| { | 2 | < 0.1% |
| „ | 1 | < 0.1% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 317 | |
| £ | 10 | 3.0% |
| ₹ | 1 | 0.3% |
| € | 1 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| 2407291 | ||
| 36 | < 0.1% | |
| 2 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 10048 | |
| ] | 50 | 0.5% |
| } | 2 | < 0.1% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 3847 | |
| ” | 690 | 15.1% |
| » | 19 | 0.4% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 672 | |
| ‘ | 192 | 21.8% |
| « | 18 | 2.0% |
Control
| Value | Count | Frequency (%) |
| 106 | ||
| | 3 | 2.7% |
| | 1 | 0.9% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 25 | |
| ` | 12 | |
| ¯ | 1 | 2.6% |
Format
| Value | Count | Frequency (%) |
| | 31 | |
| | 20 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 8 | |
| ¹ | 8 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 19 |
Line Separator
| Value | Count | Frequency (%) |
| 7 |
Letter Number
| Value | Count | Frequency (%) |
| Ⅱ | 2 |
Paragraph Separator
| Value | Count | Frequency (%) |
| 2 |
Modifier Letter
| Value | Count | Frequency (%) |
| ʼ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11543343 | |
| Common | 2825436 | 19.7% |
| Cyrillic | 4587 | < 0.1% |
| Greek | 648 | < 0.1% |
| Devanagari | 77 | < 0.1% |
| Telugu | 30 | < 0.1% |
| Hiragana | 20 | < 0.1% |
| Tamil | 19 | < 0.1% |
| Han | 10 | < 0.1% |
| Hangul | 9 | < 0.1% |
| Other values (3) | 19 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1363796 | |
| a | 941446 | 8.2% |
| t | 935707 | 8.1% |
| i | 852455 | 7.4% |
| o | 830814 | 7.2% |
| n | 823542 | 7.1% |
| s | 769736 | 6.7% |
| r | 744274 | 6.4% |
| h | 600810 | 5.2% |
| l | 478816 | 4.1% |
| Other values (132) | 3201947 |
Common
| Value | Count | Frequency (%) |
| (space) | 2407291 | |
| , | 133443 | 4.7% |
| . | 124794 | 4.4% |
| - | 35244 | 1.2% |
| ' | 31121 | 1.1% |
| " | 11661 | 0.4% |
| ) | 10048 | 0.4% |
| ( | 10024 | 0.4% |
| 1 | 9748 | 0.3% |
| 0 | 8265 | 0.3% |
| Other values (71) | 43797 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 470 | 10.2% |
| е | 404 | 8.8% |
| а | 373 | 8.1% |
| н | 323 | 7.0% |
| и | 299 | 6.5% |
| т | 265 | 5.8% |
| р | 240 | 5.2% |
| с | 218 | 4.8% |
| в | 173 | 3.8% |
| л | 161 | 3.5% |
| Other values (46) | 1661 |
Greek
| Value | Count | Frequency (%) |
| α | 60 | 9.3% |
| ο | 55 | 8.5% |
| τ | 43 | 6.6% |
| ι | 36 | 5.6% |
| η | 36 | 5.6% |
| ν | 34 | 5.2% |
| ε | 31 | 4.8% |
| ρ | 31 | 4.8% |
| π | 30 | 4.6% |
| ς | 30 | 4.6% |
| Other values (33) | 262 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 11 | 14.3% |
| न | 6 | 7.8% |
| र | 6 | 7.8% |
| म | 5 | 6.5% |
| ी | 4 | 5.2% |
| द | 3 | 3.9% |
| ो | 3 | 3.9% |
| ् | 3 | 3.9% |
| प | 3 | 3.9% |
| अ | 3 | 3.9% |
| Other values (21) | 30 |
Hiragana
| Value | Count | Frequency (%) |
| の | 4 | |
| さ | 1 | 5.0% |
| ん | 1 | 5.0% |
| と | 1 | 5.0% |
| そ | 1 | 5.0% |
| め | 1 | 5.0% |
| ひ | 1 | 5.0% |
| ち | 1 | 5.0% |
| ず | 1 | 5.0% |
| か | 1 | 5.0% |
| Other values (7) | 7 |
Telugu
| Value | Count | Frequency (%) |
| ి | 4 | |
| ్ | 3 | |
| ు | 3 | |
| ద | 3 | |
| ా | 2 | 6.7% |
| న | 2 | 6.7% |
| స | 2 | 6.7% |
| మ | 2 | 6.7% |
| ర | 2 | 6.7% |
| బ | 1 | 3.3% |
| Other values (6) | 6 |
Tamil
| Value | Count | Frequency (%) |
| ் | 3 | |
| ம | 2 | |
| ர | 2 | |
| ு | 2 | |
| ப | 2 | |
| ன | 1 | 5.3% |
| வ | 1 | 5.3% |
| த | 1 | 5.3% |
| ஆ | 1 | 5.3% |
| ய | 1 | 5.3% |
| Other values (3) | 3 |
Han
| Value | Count | Frequency (%) |
| 俣 | 1 | |
| 界 | 1 | |
| 患 | 1 | |
| 者 | 1 | |
| 世 | 1 | |
| 水 | 1 | |
| 鬼 | 1 | |
| 見 | 1 | |
| 難 | 1 | |
| 海 | 1 |
Hangul
| Value | Count | Frequency (%) |
| 사 | 2 | |
| 회 | 1 | |
| 식 | 1 | |
| 주 | 1 | |
| 기 | 1 | |
| 찾 | 1 | |
| 랑 | 1 | |
| 첫 | 1 |
Thai
| Value | Count | Frequency (%) |
| ่ | 2 | |
| ง | 1 | |
| ร | 1 | |
| พ | 1 | |
| แ | 1 | |
| ี | 1 | |
| ส | 1 |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ہ | 1 | |
| ت | 1 |
Inherited
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ̈ | 3 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14356200 | |
| Punctuation | 7270 | 0.1% |
| None | 5930 | < 0.1% |
| Cyrillic | 4587 | < 0.1% |
| Devanagari | 77 | < 0.1% |
| Telugu | 30 | < 0.1% |
| Hiragana | 20 | < 0.1% |
| Tamil | 19 | < 0.1% |
| Letterlike Symbols | 14 | < 0.1% |
| CJK | 10 | < 0.1% |
| Other values (11) | 41 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2407291 | |
| e | 1363796 | 9.5% |
| a | 941446 | 6.6% |
| t | 935707 | 6.5% |
| i | 852455 | 5.9% |
| o | 830814 | 5.8% |
| n | 823542 | 5.7% |
| s | 769736 | 5.4% |
| r | 744274 | 5.2% |
| h | 600810 | 4.2% |
| Other values (82) | 4086329 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 3847 | |
| – | 881 | 12.1% |
| ” | 690 | 9.5% |
| “ | 672 | 9.2% |
| — | 633 | 8.7% |
| … | 303 | 4.2% |
| ‘ | 192 | 2.6% |
| | 31 | 0.4% |
| | 7 | 0.1% |
| ― | 5 | 0.1% |
| Other values (4) | 9 | 0.1% |
None
| Value | Count | Frequency (%) |
| é | 1552 | |
| ä | 294 | 5.0% |
| á | 293 | 4.9% |
| ö | 250 | 4.2% |
| í | 243 | 4.1% |
| è | 209 | 3.5% |
| ü | 178 | 3.0% |
| ı | 165 | 2.8% |
| ó | 164 | 2.8% |
| ç | 158 | 2.7% |
| Other values (141) | 2424 |
Cyrillic
| Value | Count | Frequency (%) |
| о | 470 | 10.2% |
| е | 404 | 8.8% |
| а | 373 | 8.1% |
| н | 323 | 7.0% |
| и | 299 | 6.5% |
| т | 265 | 5.8% |
| р | 240 | 5.2% |
| с | 218 | 4.8% |
| в | 173 | 3.8% |
| л | 161 | 3.5% |
| Other values (46) | 1661 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 14 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 11 | 14.3% |
| न | 6 | 7.8% |
| र | 6 | 7.8% |
| म | 5 | 6.5% |
| ी | 4 | 5.2% |
| द | 3 | 3.9% |
| ो | 3 | 3.9% |
| ् | 3 | 3.9% |
| प | 3 | 3.9% |
| अ | 3 | 3.9% |
| Other values (21) | 30 |
Alphabetic PF
| Value | Count | Frequency (%) |
| fi | 4 |
Hiragana
| Value | Count | Frequency (%) |
| の | 4 | |
| さ | 1 | 5.0% |
| ん | 1 | 5.0% |
| と | 1 | 5.0% |
| そ | 1 | 5.0% |
| め | 1 | 5.0% |
| ひ | 1 | 5.0% |
| ち | 1 | 5.0% |
| ず | 1 | 5.0% |
| か | 1 | 5.0% |
| Other values (7) | 7 |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ̈ | 3 |
Telugu
| Value | Count | Frequency (%) |
| ి | 4 | |
| ్ | 3 | |
| ు | 3 | |
| ద | 3 | |
| ా | 2 | 6.7% |
| న | 2 | 6.7% |
| స | 2 | 6.7% |
| మ | 2 | 6.7% |
| ర | 2 | 6.7% |
| బ | 1 | 3.3% |
| Other values (6) | 6 |
Tamil
| Value | Count | Frequency (%) |
| ் | 3 | |
| ம | 2 | |
| ர | 2 | |
| ு | 2 | |
| ப | 2 | |
| ன | 1 | 5.3% |
| வ | 1 | 5.3% |
| த | 1 | 5.3% |
| ஆ | 1 | 5.3% |
| ய | 1 | 5.3% |
| Other values (3) | 3 |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ہ | 1 | |
| ت | 1 |
Hangul
| Value | Count | Frequency (%) |
| 사 | 2 | |
| 회 | 1 | |
| 식 | 1 | |
| 주 | 1 | |
| 기 | 1 | |
| 찾 | 1 | |
| 랑 | 1 | |
| 첫 | 1 |
Number Forms
| Value | Count | Frequency (%) |
| Ⅱ | 2 |
Modifier Letters
| Value | Count | Frequency (%) |
| ʼ | 2 |
Thai
| Value | Count | Frequency (%) |
| ่ | 2 | |
| ง | 1 | |
| ร | 1 | |
| พ | 1 | |
| แ | 1 | |
| ี | 1 | |
| ส | 1 |
CJK
| Value | Count | Frequency (%) |
| 俣 | 1 | |
| 界 | 1 | |
| 患 | 1 | |
| 者 | 1 | |
| 世 | 1 | |
| 水 | 1 | |
| 鬼 | 1 | |
| 見 | 1 | |
| 難 | 1 | |
| 海 | 1 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
Katakana
| Value | Count | Frequency (%) |
| ・ | 1 |
Currency Symbols
| Value | Count | Frequency (%) |
| ₹ | 1 | |
| € | 1 |
Specials
| Value | Count | Frequency (%) |
| � | 1 |
popularity
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
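The report marks `popularity` as an unsupported type, which usually points to a column holding mixed Python objects rather than a single numeric dtype. Below is a minimal sketch of one way to coerce such values, assuming (the report does not confirm this) that the column mixes floats with numeric strings. The helper `coerce_to_float` and the sample values are hypothetical.

```python
import math

def coerce_to_float(value):
    """Return value as a float, or NaN when it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan

# Hypothetical sample of a mixed-type column (not taken from the dataset).
raw_popularity = [21.9, "17.0", "0.5", None, "n/a"]

cleaned = [coerce_to_float(v) for v in raw_popularity]

# Count the entries that could not be parsed, so they can be inspected.
n_failed = sum(math.isnan(v) for v in cleaned)
```

On a pandas column, `pd.to_numeric(series, errors="coerce")` performs the same kind of coercion in one call.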
production_companies
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
production_countries
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
release_date
Date
| Distinct | 17334 |
|---|---|
| Distinct (%) | 38.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| Minimum | 1874-12-09 00:00:00 |
|---|---|
| Maximum | 2020-12-16 00:00:00 |
revenue
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 6863 |
|---|---|
| Distinct (%) | 15.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11229357 |
| Minimum | 0 |
|---|---|
| Maximum | 2.7879651 × 10⁹ |
| Zeros | 37972 |
| Zeros (%) | 83.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 48018459 |
| Maximum | 2.7879651 × 10⁹ |
| Range | 2.7879651 × 10⁹ |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 64387893 |
|---|---|
| Coefficient of variation (CV) | 5.7338897 |
| Kurtosis | 237.09288 |
| Mean | 11229357 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 12.255124 |
| Sum | 5.0957698 × 10¹¹ |
| Variance | 4.1458008 × 10¹⁵ |
| Monotonicity | Not monotonic |
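The descriptive statistics for `revenue` are tied together by simple identities, so the printed figures can be cross-checked against each other. The sketch below uses the rounded values from the tables above. The variable names are illustrative only.

```python
# Values as printed in the revenue tables (rounded by the report).
n_rows = 45379
mean = 11229357
std = 64387893

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the squared standard deviation.
variance = std ** 2

# The column sum is the mean times the number of observations.
total = mean * n_rows

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")
```

A coefficient of variation well above 1 is consistent with a column in which 83.7% of values are zero and a few values reach the billions.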
Common values
| Value | Count | Frequency (%) |
| 0 | 37972 | 83.7% |
| 12000000 | 20 | < 0.1% |
| 10000000 | 19 | < 0.1% |
| 11000000 | 19 | < 0.1% |
| 2000000 | 18 | < 0.1% |
| 6000000 | 17 | < 0.1% |
| 5000000 | 14 | < 0.1% |
| 8000000 | 13 | < 0.1% |
| 500000 | 13 | < 0.1% |
| 1 | 12 | < 0.1% |
| Other values (6853) | 7262 | 16.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 37972 | 83.7% |
| 1 | 12 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 9 | < 0.1% |
| 4 | 4 | < 0.1% |
| 5 | 5 | < 0.1% |
| 6 | 2 | < 0.1% |
| 7 | 4 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 2787965087 | 1 | < 0.1% |
| 2068223624 | 1 | < 0.1% |
| 1845034188 | 1 | < 0.1% |
| 1519557910 | 1 | < 0.1% |
| 1513528810 | 1 | < 0.1% |
| 1506249360 | 1 | < 0.1% |
| 1405403694 | 1 | < 0.1% |
| 1342000000 | 1 | < 0.1% |
| 1274219009 | 1 | < 0.1% |
| 1262886337 | 1 | < 0.1% |
runtime
Real number (ℝ)
| Distinct | 353 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 93.664889 |
| Minimum | 0 |
|---|---|
| Maximum | 1256 |
| Zeros | 1784 |
| Zeros (%) | 3.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 85 |
| median | 95 |
| Q3 | 107 |
| 95-th percentile | 138 |
| Maximum | 1256 |
| Range | 1256 |
| Interquartile range (IQR) | 22 |
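The quartiles above are enough to apply the common 1.5 × IQR rule for flagging unusual runtimes. This rule is a general convention, not something the report itself applies, so the sketch below is only one possible reading of the numbers.

```python
# Quartiles as printed in the runtime quantile table.
q1, q3 = 85, 107

iqr = q3 - q1                      # interquartile range
lower_fence = q1 - 1.5 * iqr       # values below this are flagged
upper_fence = q3 + 1.5 * iqr       # values above this are flagged

print(iqr, lower_fence, upper_fence)
```

Under this rule the 1784 zero-minute entries and the 1256-minute maximum both fall outside the fences, while the 95th percentile (138) stays inside.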
Descriptive statistics
| Standard deviation | 38.863558 |
|---|---|
| Coefficient of variation (CV) | 0.4149213 |
| Kurtosis | 88.727685 |
| Mean | 93.664889 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 4.2501696 |
| Sum | 4250419 |
| Variance | 1510.3761 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 90 | 2549 | 5.6% |
| 0 | 1784 | 3.9% |
| 100 | 1470 | 3.2% |
| 95 | 1410 | 3.1% |
| 93 | 1214 | 2.7% |
| 96 | 1104 | 2.4% |
| 92 | 1079 | 2.4% |
| 94 | 1062 | 2.3% |
| 91 | 1055 | 2.3% |
| 88 | 1030 | 2.3% |
| Other values (343) | 31622 | 69.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1784 | 3.9% |
| 1 | 107 | 0.2% |
| 2 | 33 | 0.1% |
| 3 | 48 | 0.1% |
| 4 | 50 | 0.1% |
| 5 | 51 | 0.1% |
| 6 | 72 | 0.2% |
| 7 | 103 | 0.2% |
| 8 | 78 | 0.2% |
| 9 | 63 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 1256 | 1 | < 0.1% |
| 1140 | 2 | < 0.1% |
| 931 | 1 | < 0.1% |
| 925 | 1 | < 0.1% |
| 900 | 1 | < 0.1% |
| 877 | 1 | < 0.1% |
| 874 | 1 | < 0.1% |
| 840 | 2 | < 0.1% |
| 780 | 1 | < 0.1% |
| 720 | 1 | < 0.1% |
spoken_languages
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 1842 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| English | 22380 |
|---|---|
| sin datos | 3894 |
| Français | 1852 |
| 日本語 | 1289 |
| Italiano | 1217 |
| Other values (1837) | 14747 |
Length
| Max length | 171 |
|---|---|
| Median length | 7 |
| Mean length | 9.3637145 |
| Min length | 2 |
Characters and Unicode
| Total characters | 424916 |
|---|---|
| Distinct characters | 171 |
| Distinct categories | 8 |
| Distinct scripts | 15 |
| Distinct blocks | 16 |
Unique
| Unique | 1293 |
|---|---|
| Unique (%) | 2.8% |
Sample
| 1st row | English |
|---|---|
| 2nd row | English, Français |
| 3rd row | English |
| 4th row | English |
| 5th row | English |
Common Values
| Value | Count | Frequency (%) |
| English | 22380 | 49.3% |
| sin datos | 3894 | 8.6% |
| Français | 1852 | 4.1% |
| 日本語 | 1289 | 2.8% |
| Italiano | 1217 | 2.7% |
| Español | 901 | 2.0% |
| Pусский | 807 | 1.8% |
| Deutsch | 761 | 1.7% |
| English, Français | 681 | 1.5% |
| English, Español | 572 | 1.3% |
| Other values (1832) | 11025 | 24.3% |
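Many `spoken_languages` cells hold several languages in one string (for example `English, Français`), which is why the column shows 1842 distinct values. A minimal sketch of counting individual languages follows, assuming the separator is always a comma and that `sin datos` is a placeholder for rows without data. Both are assumptions read off the tables above, and the sample rows are illustrative.

```python
from collections import Counter

# Illustrative cells in the format shown in the tables above.
rows = [
    "English",
    "English, Français",
    "English",
    "sin datos",
    "English, Español",
]

# Split each cell on commas, skip the placeholder, and count each language.
language_counts = Counter(
    lang.strip()
    for cell in rows
    if cell != "sin datos"
    for lang in cell.split(",")
)

print(language_counts.most_common())
```

Counting per language instead of per cell removes most of the artificial cardinality, although a language name that itself contains a comma would be split incorrectly.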
Words
| Value | Count | Frequency (%) |
| english | 28729 | |
| français | 4194 | 6.7% |
| sin | 3894 | 6.3% |
| datos | 3894 | 6.3% |
| deutsch | 2624 | 4.2% |
| español | 2412 | 3.9% |
| italiano | 2366 | 3.8% |
| 日本語 | 1758 | 2.8% |
| pусский | 1562 | 2.5% |
| 普通话 | 790 | 1.3% |
| Other values (71) | 9935 | 16.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 50058 | |
| n | 41356 | 9.7% |
| i | 41003 | 9.6% |
| l | 34631 | 8.2% |
| h | 31459 | 7.4% |
| E | 31198 | 7.3% |
| g | 30413 | 7.2% |
| a | 22840 | 5.4% |
| (space) | 16973 | 4.0% |
| , | 11666 | 2.7% |
| Other values (161) | 113319 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 323180 | |
| Uppercase Letter | 46428 | 10.9% |
| Other Letter | 22191 | 5.2% |
| Space Separator | 16973 | 4.0% |
| Other Punctuation | 12731 | 3.0% |
| Spacing Mark | 1838 | 0.4% |
| Nonspacing Mark | 1549 | 0.4% |
| Control | 26 | < 0.1% |
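The category names in this table correspond to Unicode general categories. As a rough illustration, Python's standard `unicodedata` module reports them as two-letter codes (`Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator, `Po` for Other Punctuation, and so on). The helper name `category_counts` is hypothetical, and this is not necessarily how the report computes its figures.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters per Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# "English, Français" written with an escape for the c-cedilla.
counts = category_counts("English, Fran\u00e7ais")

print(counts)
```

Applied to every cell of a column and summed, the same idea yields a table like the one above.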
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 50058 | |
| n | 41356 | |
| i | 41003 | |
| l | 34631 | |
| h | 31459 | |
| g | 30413 | |
| a | 22840 | |
| o | 10947 | 3.4% |
| t | 9871 | 3.1% |
| r | 6128 | 1.9% |
| Other values (63) | 44474 |
Other Letter
| Value | Count | Frequency (%) |
| 語 | 1758 | 7.9% |
| 本 | 1758 | 7.9% |
| 日 | 1758 | 7.9% |
| 话 | 1263 | 5.7% |
| 州 | 946 | 4.3% |
| 普 | 790 | 3.6% |
| 通 | 790 | 3.6% |
| ह | 707 | 3.2% |
| द | 707 | 3.2% |
| न | 707 | 3.2% |
| Other values (46) | 11007 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 31198 | |
| F | 4196 | 9.0% |
| D | 2926 | 6.3% |
| P | 2677 | 5.8% |
| I | 2366 | 5.1% |
| N | 829 | 1.8% |
| L | 505 | 1.1% |
| M | 362 | 0.8% |
| T | 308 | 0.7% |
| Č | 284 | 0.6% |
| Other values (13) | 777 | 1.7% |
Spacing Mark
| Value | Count | Frequency (%) |
| ि | 707 | |
| ी | 707 | |
| ు | 136 | 7.4% |
| ி | 111 | 6.0% |
| া | 94 | 5.1% |
| ং | 47 | 2.6% |
| ਾ | 18 | 1.0% |
| ੀ | 18 | 1.0% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ् | 707 | |
| ִ | 430 | |
| ְ | 215 | 13.9% |
| ் | 111 | 7.2% |
| ె | 68 | 4.4% |
| ੰ | 18 | 1.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 11666 | |
| / | 1015 | 8.0% |
| ? | 50 | 0.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 16973 |
Control
| Value | Count | Frequency (%) |
| | 26 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 357219 | |
| Common | 29730 | 7.0% |
| Han | 10482 | 2.5% |
| Cyrillic | 10454 | 2.5% |
| Devanagari | 4242 | 1.0% |
| Arabic | 3344 | 0.8% |
| Hangul | 3252 | 0.8% |
| Hebrew | 1720 | 0.4% |
| Greek | 1704 | 0.4% |
| Thai | 1232 | 0.3% |
| Other values (5) | 1537 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 50058 | |
| n | 41356 | |
| i | 41003 | |
| l | 34631 | |
| h | 31459 | |
| E | 31198 | |
| g | 30413 | |
| a | 22840 | |
| o | 10947 | 3.1% |
| t | 9871 | 2.8% |
| Other values (50) | 53443 |
Cyrillic
| Value | Count | Frequency (%) |
| с | 3211 | |
| к | 1734 | |
| и | 1679 | |
| й | 1615 | |
| у | 1564 | |
| а | 113 | 1.1% |
| р | 87 | 0.8% |
| У | 53 | 0.5% |
| ї | 53 | 0.5% |
| н | 53 | 0.5% |
| Other values (12) | 292 | 2.8% |
Arabic
| Value | Count | Frequency (%) |
| ا | 537 | |
| ر | 537 | |
| ل | 341 | |
| ع | 341 | |
| ب | 341 | |
| ي | 341 | |
| ة | 341 | |
| ی | 141 | 4.2% |
| ف | 141 | 4.2% |
| س | 141 | 4.2% |
| Other values (5) | 142 | 4.2% |
Han
| Value | Count | Frequency (%) |
| 語 | 1758 | |
| 本 | 1758 | |
| 日 | 1758 | |
| 话 | 1263 | |
| 州 | 946 | |
| 普 | 790 | |
| 通 | 790 | |
| 話 | 473 | 4.5% |
| 廣 | 473 | 4.5% |
| 广 | 473 | 4.5% |
Hebrew
| Value | Count | Frequency (%) |
| ִ | 430 | |
| ת | 215 | |
| י | 215 | |
| ר | 215 | |
| ְ | 215 | |
| ב | 215 | |
| ע | 215 |
Greek
| Value | Count | Frequency (%) |
| λ | 426 | |
| ά | 213 | |
| κ | 213 | |
| ι | 213 | |
| ν | 213 | |
| ε | 213 | |
| η | 213 |
Georgian
| Value | Count | Frequency (%) |
| ლ | 33 | |
| ა | 33 | |
| ი | 33 | |
| უ | 33 | |
| თ | 33 | |
| რ | 33 | |
| ქ | 33 |
Devanagari
| Value | Count | Frequency (%) |
| ि | 707 | |
| ह | 707 | |
| ी | 707 | |
| द | 707 | |
| ् | 707 | |
| न | 707 |
Hangul
| Value | Count | Frequency (%) |
| 국 | 542 | |
| 어 | 542 | |
| 선 | 542 | |
| 말 | 542 | |
| 한 | 542 | |
| 조 | 542 |
Thai
| Value | Count | Frequency (%) |
| า | 352 | |
| ท | 176 | |
| ย | 176 | |
| ไ | 176 | |
| ษ | 176 | |
| ภ | 176 |
Gurmukhi
| Value | Count | Frequency (%) |
| ਾ | 18 | |
| ੀ | 18 | |
| ਬ | 18 | |
| ਜ | 18 | |
| ੰ | 18 | |
| ਪ | 18 |
Common
| Value | Count | Frequency (%) |
| (space) | 16973 | |
| , | 11666 | |
| / | 1015 | 3.4% |
| ? | 50 | 0.2% |
| | 26 | 0.1% |
Telugu
| Value | Count | Frequency (%) |
| ు | 136 | |
| గ | 68 | |
| ల | 68 | |
| ె | 68 | |
| త | 68 |
Tamil
| Value | Count | Frequency (%) |
| த | 111 | |
| ி | 111 | |
| ழ | 111 | |
| ் | 111 | |
| ம | 111 |
Bengali
| Value | Count | Frequency (%) |
| া | 94 | |
| ল | 47 | |
| ং | 47 | |
| ব | 47 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 378093 | |
| CJK | 10482 | 2.5% |
| Cyrillic | 10454 | 2.5% |
| None | 10434 | 2.5% |
| Devanagari | 4242 | 1.0% |
| Arabic | 3344 | 0.8% |
| Hangul | 3252 | 0.8% |
| Hebrew | 1720 | 0.4% |
| Thai | 1232 | 0.3% |
| Tamil | 555 | 0.1% |
| Other values (6) | 1108 | 0.3% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 50058 | |
| n | 41356 | |
| i | 41003 | |
| l | 34631 | |
| h | 31459 | |
| E | 31198 | |
| g | 30413 | |
| a | 22840 | 6.0% |
| (space) | 16973 | 4.5% |
| , | 11666 | 3.1% |
| Other values (38) | 66496 |
None
| Value | Count | Frequency (%) |
| ç | 4441 | |
| ñ | 2412 | |
| ê | 591 | 5.7% |
| λ | 426 | 4.1% |
| ý | 284 | 2.7% |
| Č | 284 | 2.7% |
| ü | 247 | 2.4% |
| ά | 213 | 2.0% |
| κ | 213 | 2.0% |
| ι | 213 | 2.0% |
| Other values (11) | 1110 | 10.6% |
Cyrillic
| Value | Count | Frequency (%) |
| с | 3211 | |
| к | 1734 | |
| и | 1679 | |
| й | 1615 | |
| у | 1564 | |
| а | 113 | 1.1% |
| р | 87 | 0.8% |
| У | 53 | 0.5% |
| ї | 53 | 0.5% |
| н | 53 | 0.5% |
| Other values (12) | 292 | 2.8% |
CJK
| Value | Count | Frequency (%) |
| 語 | 1758 | |
| 本 | 1758 | |
| 日 | 1758 | |
| 话 | 1263 | |
| 州 | 946 | |
| 普 | 790 | |
| 通 | 790 | |
| 話 | 473 | 4.5% |
| 廣 | 473 | 4.5% |
| 广 | 473 | 4.5% |
Devanagari
| Value | Count | Frequency (%) |
| ि | 707 | |
| ह | 707 | |
| ी | 707 | |
| द | 707 | |
| ् | 707 | |
| न | 707 |
Hangul
| Value | Count | Frequency (%) |
| 국 | 542 | |
| 어 | 542 | |
| 선 | 542 | |
| 말 | 542 | |
| 한 | 542 | |
| 조 | 542 |
Arabic
| Value | Count | Frequency (%) |
| ا | 537 | |
| ر | 537 | |
| ل | 341 | |
| ع | 341 | |
| ب | 341 | |
| ي | 341 | |
| ة | 341 | |
| ی | 141 | 4.2% |
| ف | 141 | 4.2% |
| س | 141 | 4.2% |
| Other values (5) | 142 | 4.2% |
Hebrew
| Value | Count | Frequency (%) |
| ִ | 430 | |
| ת | 215 | |
| י | 215 | |
| ר | 215 | |
| ְ | 215 | |
| ב | 215 | |
| ע | 215 |
Thai
| Value | Count | Frequency (%) |
| า | 352 | |
| ท | 176 | |
| ย | 176 | |
| ไ | 176 | |
| ษ | 176 | |
| ภ | 176 |
Telugu
| Value | Count | Frequency (%) |
| ు | 136 | |
| గ | 68 | |
| ల | 68 | |
| ె | 68 | |
| త | 68 |
Tamil
| Value | Count | Frequency (%) |
| த | 111 | |
| ி | 111 | |
| ழ | 111 | |
| ் | 111 | |
| ம | 111 |
Bengali
| Value | Count | Frequency (%) |
| া | 94 | |
| ল | 47 | |
| ং | 47 | |
| ব | 47 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ế | 61 | |
| ệ | 61 |
Georgian
| Value | Count | Frequency (%) |
| ლ | 33 | |
| ა | 33 | |
| ი | 33 | |
| უ | 33 | |
| თ | 33 | |
| რ | 33 | |
| ქ | 33 |
Gurmukhi
| Value | Count | Frequency (%) |
| ਾ | 18 | |
| ੀ | 18 | |
| ਬ | 18 | |
| ਜ | 18 | |
| ੰ | 18 | |
| ਪ | 18 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 4 |
status
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| Released | 44936 |
|---|---|
| Rumored | 230 |
| Post Production | 97 |
| sin datos | 83 |
| In Production | 19 |
| Other values (2) | 14 |
Length
| Max length | 15 |
|---|---|
| Median length | 8 |
| Mean length | 8.0135305 |
| Min length | 7 |
Characters and Unicode
| Total characters | 363646 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Released |
|---|---|
| 2nd row | Released |
| 3rd row | Released |
| 4th row | Released |
| 5th row | Released |
Common Values
| Value | Count | Frequency (%) |
| Released | 44936 | 99.0% |
| Rumored | 230 | 0.5% |
| Post Production | 97 | 0.2% |
| sin datos | 83 | 0.2% |
| In Production | 19 | < 0.1% |
| Planned | 13 | < 0.1% |
| Canceled | 1 | < 0.1% |
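The report summary describes `status` as highly imbalanced (96.6%). One definition that reproduces this figure from the counts above is one minus the Shannon entropy of the class distribution, normalised by the logarithm of the number of classes. Treat this definition as an assumption, since the report does not state its formula.

```python
import math

# Counts copied from the Common Values table for status.
status_counts = {
    "Released": 44936,
    "Rumored": 230,
    "Post Production": 97,
    "sin datos": 83,
    "In Production": 19,
    "Planned": 13,
    "Canceled": 1,
}

def imbalance(counts):
    """Return 1 - H / log(k): 0 for a uniform split, near 1 for one dominant class."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(len(counts))

score = imbalance(list(status_counts.values()))
print(f"{score:.1%}")
```

A score this close to 1 means the column carries very little information, because almost every row is `Released`.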
Words
| Value | Count | Frequency (%) |
| released | 44936 | |
| rumored | 230 | 0.5% |
| production | 116 | 0.3% |
| post | 97 | 0.2% |
| sin | 83 | 0.2% |
| datos | 83 | 0.2% |
| in | 19 | < 0.1% |
| planned | 13 | < 0.1% |
| canceled | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 135053 | |
| d | 45379 | 12.5% |
| s | 45199 | 12.4% |
| R | 45166 | 12.4% |
| a | 45033 | 12.4% |
| l | 44950 | 12.4% |
| o | 642 | 0.2% |
| r | 346 | 0.1% |
| u | 346 | 0.1% |
| t | 296 | 0.1% |
| Other values (8) | 1236 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 318035 | |
| Uppercase Letter | 45412 | 12.5% |
| Space Separator | 199 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 135053 | |
| d | 45379 | 14.3% |
| s | 45199 | 14.2% |
| a | 45033 | 14.2% |
| l | 44950 | 14.1% |
| o | 642 | 0.2% |
| r | 346 | 0.1% |
| u | 346 | 0.1% |
| t | 296 | 0.1% |
| n | 245 | 0.1% |
| Other values (3) | 546 | 0.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 45166 | |
| P | 226 | 0.5% |
| I | 19 | < 0.1% |
| C | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 199 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 363447 | |
| Common | 199 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 135053 | |
| d | 45379 | 12.5% |
| s | 45199 | 12.4% |
| R | 45166 | 12.4% |
| a | 45033 | 12.4% |
| l | 44950 | 12.4% |
| o | 642 | 0.2% |
| r | 346 | 0.1% |
| u | 346 | 0.1% |
| t | 296 | 0.1% |
| Other values (7) | 1037 | 0.3% |
Common
| Value | Count | Frequency (%) |
| (space) | 199 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 363646 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 135053 | |
| d | 45379 | 12.5% |
| s | 45199 | 12.4% |
| R | 45166 | 12.4% |
| a | 45033 | 12.4% |
| l | 44950 | 12.4% |
| o | 642 | 0.2% |
| r | 346 | 0.1% |
| u | 346 | 0.1% |
| t | 296 | 0.1% |
| Other values (8) | 1236 | 0.3% |
tagline
Categorical
| Distinct | 20270 |
|---|---|
| Distinct (%) | 44.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| sin datos | 24981 |
|---|---|
| Based on a true story. | 7 |
| Trust no one. | 4 |
| Be careful what you wish for. | 4 |
| - | 4 |
| Other values (20265) | 20379 |
Length
| Max length | 297 |
|---|---|
| Median length | 9 |
| Mean length | 26.080808 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1183521 |
|---|---|
| Distinct characters | 170 |
| Distinct categories | 17 |
| Distinct scripts | 6 |
| Distinct blocks | 10 |
Unique
| Unique | 20163 |
|---|---|
| Unique (%) | 44.4% |
Sample
| 1st row | sin datos |
|---|---|
| 2nd row | Roll the dice and unleash the excitement! |
| 3rd row | Still Yelling. Still Fighting. Still Ready for Love. |
| 4th row | Friends are the people who let you be yourself... and never let you forget it. |
| 5th row | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! |
Common Values
| Value | Count | Frequency (%) |
| sin datos | 24981 | 55.0% |
| Based on a true story. | 7 | < 0.1% |
| Trust no one. | 4 | < 0.1% |
| Be careful what you wish for. | 4 | < 0.1% |
| - | 4 | < 0.1% |
| How far would you go? | 3 | < 0.1% |
| Drama | 3 | < 0.1% |
| Classic Albums | 3 | < 0.1% |
| There are two sides to every love story. | 3 | < 0.1% |
| There is no turning back | 3 | < 0.1% |
| Other values (20260) | 20364 | 44.9% |
Words
| Value | Count | Frequency (%) |
| sin | 25008 | 11.2% |
| datos | 24981 | 11.2% |
| the | 10998 | 4.9% |
| a | 6815 | 3.0% |
| of | 4404 | 2.0% |
| to | 3584 | 1.6% |
| is | 2796 | 1.2% |
| in | 2693 | 1.2% |
| and | 2682 | 1.2% |
| you | 2389 | 1.1% |
| Other values (15101) | 137548 |
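The two most frequent "words" here, `sin` and `datos`, come almost entirely from the `sin datos` value that fills 24981 tagline cells. A minimal sketch of a word count that skips that value follows, assuming `sin datos` (Spanish for "no data") is a missing-value placeholder. The tokenisation rule and sample taglines are illustrative and may differ from what the report uses.

```python
from collections import Counter

PLACEHOLDER = "sin datos"  # assumed missing-value marker

# Illustrative taglines, including placeholder rows.
taglines = [
    "sin datos",
    "Trust no one.",
    "sin datos",
    "Be careful what you wish for.",
    "Trust no one.",
]

# Lowercase each word, trim surrounding punctuation, and skip placeholder rows.
word_counts = Counter(
    word.strip(".,!?\"'").lower()
    for line in taglines
    if line != PLACEHOLDER
    for word in line.split()
)

print(word_counts.most_common(3))
```

Filtering the placeholder first keeps it from dominating both the word table and the length statistics.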
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 178667 | |
| e | 94412 | 8.0% |
| s | 92322 | 7.8% |
| t | 82248 | 6.9% |
| o | 81547 | 6.9% |
| a | 76454 | 6.5% |
| n | 72479 | 6.1% |
| i | 71017 | 6.0% |
| d | 48472 | 4.1% |
| r | 44992 | 3.8% |
| Other values (160) | 340911 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 880327 | |
| Space Separator | 178667 | 15.1% |
| Uppercase Letter | 74991 | 6.3% |
| Other Punctuation | 44585 | 3.8% |
| Decimal Number | 2687 | 0.2% |
| Dash Punctuation | 1944 | 0.2% |
| Final Punctuation | 98 | < 0.1% |
| Open Punctuation | 56 | < 0.1% |
| Close Punctuation | 55 | < 0.1% |
| Currency Symbol | 37 | < 0.1% |
| Other values (7) | 74 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 94412 | |
| s | 92322 | |
| t | 82248 | |
| o | 81547 | |
| a | 76454 | |
| n | 72479 | |
| i | 71017 | 8.1% |
| d | 48472 | 5.5% |
| r | 44992 | 5.1% |
| h | 37172 | 4.2% |
| Other values (43) | 179212 |
Other Letter
| Value | Count | Frequency (%) |
| வ | 1 | 2.9% |
| ன | 1 | 2.9% |
| 成 | 1 | 2.9% |
| 劇 | 1 | 2.9% |
| 熟 | 1 | 2.9% |
| த | 1 | 2.9% |
| ஆ | 1 | 2.9% |
| 時 | 1 | 2.9% |
| 舞 | 1 | 2.9% |
| 場 | 1 | 2.9% |
| Other values (24) | 24 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 10009 | 13.3% |
| A | 6874 | 9.2% |
| S | 5652 | 7.5% |
| H | 4402 | 5.9% |
| I | 4387 | 5.9% |
| E | 4306 | 5.7% |
| W | 3681 | 4.9% |
| O | 3477 | 4.6% |
| N | 3195 | 4.3% |
| L | 3194 | 4.3% |
| Other values (20) | 25814 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 26647 | |
| ! | 5784 | 13.0% |
| ' | 5674 | 12.7% |
| , | 4226 | 9.5% |
| ? | 1161 | 2.6% |
| " | 582 | 1.3% |
| … | 148 | 0.3% |
| : | 138 | 0.3% |
| & | 83 | 0.2% |
| * | 42 | 0.1% |
| Other values (7) | 100 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 802 | |
| 1 | 516 | |
| 2 | 299 | 11.1% |
| 3 | 208 | 7.7% |
| 9 | 208 | 7.7% |
| 5 | 168 | 6.3% |
| 4 | 140 | 5.2% |
| 7 | 121 | 4.5% |
| 6 | 121 | 4.5% |
| 8 | 104 | 3.9% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 5 | |
| = | 5 | |
| | | 2 | 14.3% |
| ~ | 1 | 7.1% |
| − | 1 | 7.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1927 | |
| – | 9 | 0.5% |
| — | 8 | 0.4% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 82 | |
| ” | 15 | 15.3% |
| » | 1 | 1.0% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 14 | |
| ‘ | 4 | 21.1% |
| « | 1 | 5.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 49 | |
| [ | 7 | 12.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 48 | |
| ] | 7 | 12.7% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 | |
| ² | 1 |
Modifier Letter
| Value | Count | Frequency (%) |
| ˌ | 1 | |
| ˈ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 178667 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 37 |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ் | 1 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 955318 | |
| Common | 228168 | 19.3% |
| Han | 21 | < 0.1% |
| Tamil | 5 | < 0.1% |
| Hiragana | 5 | < 0.1% |
| Katakana | 4 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 94412 | 9.9% |
| s | 92322 | 9.7% |
| t | 82248 | 8.6% |
| o | 81547 | 8.5% |
| a | 76454 | 8.0% |
| n | 72479 | 7.6% |
| i | 71017 | 7.4% |
| d | 48472 | 5.1% |
| r | 44992 | 4.7% |
| h | 37172 | 3.9% |
| Other values (73) | 254203 |
Common
| Value | Count | Frequency (%) |
| (space) | 178667 | |
| . | 26647 | 11.7% |
| ! | 5784 | 2.5% |
| ' | 5674 | 2.5% |
| , | 4226 | 1.9% |
| - | 1927 | 0.8% |
| ? | 1161 | 0.5% |
| 0 | 802 | 0.4% |
| " | 582 | 0.3% |
| 1 | 516 | 0.2% |
| Other values (42) | 2182 | 1.0% |
Han
| Value | Count | Frequency (%) |
| 成 | 1 | 4.8% |
| 劇 | 1 | 4.8% |
| 熟 | 1 | 4.8% |
| 時 | 1 | 4.8% |
| 舞 | 1 | 4.8% |
| 場 | 1 | 4.8% |
| 版 | 1 | 4.8% |
| 蜜 | 1 | 4.8% |
| 最 | 1 | 4.8% |
| 后 | 1 | 4.8% |
| Other values (11) | 11 |
Tamil
| Value | Count | Frequency (%) |
| வ | 1 | |
| ் | 1 | |
| ன | 1 | |
| த | 1 | |
| ஆ | 1 |
Hiragana
| Value | Count | Frequency (%) |
| は | 1 | |
| し | 1 | |
| て | 1 | |
| い | 1 | |
| る | 1 |
Katakana
| Value | Count | Frequency (%) |
| ク | 1 | |
| ラ | 1 | |
| ナ | 1 | |
| ド | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1183091 | |
| Punctuation | 280 | < 0.1% |
| None | 110 | < 0.1% |
| CJK | 21 | < 0.1% |
| Tamil | 5 | < 0.1% |
| Hiragana | 5 | < 0.1% |
| Katakana | 4 | < 0.1% |
| IPA Ext | 2 | < 0.1% |
| Modifier Letters | 2 | < 0.1% |
| Math Operators | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 178667 | |
| e | 94412 | 8.0% |
| s | 92322 | 7.8% |
| t | 82248 | 7.0% |
| o | 81547 | 6.9% |
| a | 76454 | 6.5% |
| n | 72479 | 6.1% |
| i | 71017 | 6.0% |
| d | 48472 | 4.1% |
| r | 44992 | 3.8% |
| Other values (78) | 340481 |
Punctuation
| Value | Count | Frequency (%) |
| … | 148 | |
| ’ | 82 | |
| ” | 15 | 5.4% |
| “ | 14 | 5.0% |
| – | 9 | 3.2% |
| — | 8 | 2.9% |
| ‘ | 4 | 1.4% |
None
| Value | Count | Frequency (%) |
| é | 18 | |
| ä | 16 | |
| ö | 8 | 7.3% |
| á | 6 | 5.5% |
| ó | 6 | 5.5% |
| ü | 5 | 4.5% |
| í | 5 | 4.5% |
| ı | 5 | 4.5% |
| · | 4 | 3.6% |
| ć | 3 | 2.7% |
| Other values (26) | 34 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 2 |
Tamil
| Value | Count | Frequency (%) |
| வ | 1 | |
| ் | 1 | |
| ன | 1 | |
| த | 1 | |
| ஆ | 1 |
CJK
| Value | Count | Frequency (%) |
| 成 | 1 | 4.8% |
| 劇 | 1 | 4.8% |
| 熟 | 1 | 4.8% |
| 時 | 1 | 4.8% |
| 舞 | 1 | 4.8% |
| 場 | 1 | 4.8% |
| 版 | 1 | 4.8% |
| 蜜 | 1 | 4.8% |
| 最 | 1 | 4.8% |
| 后 | 1 | 4.8% |
| Other values (11) | 11 |
Katakana
| Value | Count | Frequency (%) |
| ク | 1 | |
| ラ | 1 | |
| ナ | 1 | |
| ド | 1 |
Modifier Letters
| Value | Count | Frequency (%) |
| ˌ | 1 | |
| ˈ | 1 |
Hiragana
| Value | Count | Frequency (%) |
| は | 1 | |
| し | 1 | |
| て | 1 | |
| い | 1 | |
| る | 1 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
title
Categorical
HIGH CARDINALITY  UNIFORM 
| Distinct | 42197 |
|---|---|
| Distinct (%) | 93.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| Cinderella | 11 |
|---|---|
| Hamlet | 9 |
| Alice in Wonderland | 9 |
| Les Misérables | 8 |
| Beauty and the Beast | 8 |
| Other values (42192) | 45334 |
Length
| Max length | 105 |
|---|---|
| Median length | 79 |
| Mean length | 16.701272 |
| Min length | 1 |
Characters and Unicode
| Total characters | 757887 |
|---|---|
| Distinct characters | 287 |
| Distinct categories | 17 |
| Distinct scripts | 7 |
| Distinct blocks | 12 |
Unique
| Unique | 39869 |
|---|---|
| Unique (%) | 87.9% |
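The report lists both Distinct (42197) and Unique (39869) for `title`, and the two are easy to confuse. The sketch below assumes that "distinct" counts the different values present and "unique" counts the values that occur exactly once. The sample titles are illustrative.

```python
from collections import Counter

# Illustrative titles; two of them are repeated.
titles = ["Cinderella", "Hamlet", "Cinderella", "Toy Story", "Jumanji", "Hamlet"]

counts = Counter(titles)

# Distinct: how many different values appear at all.
n_distinct = len(counts)

# Unique: how many values appear exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(n_distinct, n_unique)
```

Under that reading, the gap between the two figures (2328 titles) is the number of titles shared by more than one row, such as `Cinderella` or `Hamlet`.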
Sample
| 1st row | Toy Story |
|---|---|
| 2nd row | Jumanji |
| 3rd row | Grumpier Old Men |
| 4th row | Waiting to Exhale |
| 5th row | Father of the Bride Part II |
Common Values
| Value | Count | Frequency (%) |
| Cinderella | 11 | < 0.1% |
| Hamlet | 9 | < 0.1% |
| Alice in Wonderland | 9 | < 0.1% |
| Les Misérables | 8 | < 0.1% |
| Beauty and the Beast | 8 | < 0.1% |
| Treasure Island | 7 | < 0.1% |
| The Three Musketeers | 7 | < 0.1% |
| Blackout | 7 | < 0.1% |
| A Christmas Carol | 7 | < 0.1% |
| Aftermath | 6 | < 0.1% |
| Other values (42187) | 45300 | 99.8% |
Words
| Value | Count | Frequency (%) |
| the | 14555 | 10.7% |
| of | 4930 | 3.6% |
| a | 2241 | 1.6% |
| in | 1693 | 1.2% |
| and | 1631 | 1.2% |
| to | 1054 | 0.8% |
| | 757 | 0.6% |
| man | 665 | 0.5% |
| love | 664 | 0.5% |
| for | 601 | 0.4% |
| Other values (24354) | 107396 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 90830 | 12.0% |
| e | 76251 | 10.1% |
| a | 48943 | 6.5% |
| o | 45674 | 6.0% |
| n | 40820 | 5.4% |
| r | 40018 | 5.3% |
| i | 39767 | 5.2% |
| t | 36725 | 4.8% |
| s | 29525 | 3.9% |
| h | 28516 | 3.8% |
| Other values (277) | 280818 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 534158 | |
| Uppercase Letter | 117265 | 15.5% |
| Space Separator | 90830 | 12.0% |
| Other Punctuation | 10489 | 1.4% |
| Decimal Number | 3850 | 0.5% |
| Dash Punctuation | 981 | 0.1% |
| Close Punctuation | 87 | < 0.1% |
| Open Punctuation | 85 | < 0.1% |
| Final Punctuation | 38 | < 0.1% |
| Other Letter | 25 | < 0.1% |
| Other values (7) | 79 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 76251 | |
| a | 48943 | |
| o | 45674 | 8.6% |
| n | 40820 | 7.6% |
| r | 40018 | 7.5% |
| i | 39767 | 7.4% |
| t | 36725 | 6.9% |
| s | 29525 | 5.5% |
| h | 28516 | 5.3% |
| l | 25924 | 4.9% |
| Other values (121) | 121995 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 16019 | |
| S | 10336 | 8.8% |
| M | 8031 | 6.8% |
| B | 7659 | 6.5% |
| C | 7165 | 6.1% |
| A | 6785 | 5.8% |
| D | 6335 | 5.4% |
| L | 5872 | 5.0% |
| H | 5170 | 4.4% |
| W | 5166 | 4.4% |
| Other values (65) | 38727 |
Other Letter
| Value | Count | Frequency (%) |
| چ | 2 | 8.0% |
| ه | 2 | 8.0% |
| ی | 2 | 8.0% |
| ک | 2 | 8.0% |
| 傳 | 1 | 4.0% |
| 空 | 1 | 4.0% |
| 時 | 1 | 4.0% |
| 狗 | 1 | 4.0% |
| 貓 | 1 | 4.0% |
| ª | 1 | 4.0% |
| Other values (11) | 11 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 3717 | |
| ' | 2505 | |
| . | 1603 | |
| , | 1134 | 10.8% |
| ! | 647 | 6.2% |
| & | 458 | 4.4% |
| ? | 269 | 2.6% |
| / | 79 | 0.8% |
| * | 19 | 0.2% |
| # | 13 | 0.1% |
| Other values (8) | 45 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 861 | |
| 1 | 697 | |
| 0 | 616 | |
| 3 | 482 | |
| 9 | 230 | 6.0% |
| 4 | 229 | 5.9% |
| 5 | 225 | 5.8% |
| 7 | 193 | 5.0% |
| 8 | 161 | 4.2% |
| 6 | 156 | 4.1% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 17 | |
| × | 3 | 12.5% |
| ∞ | 1 | 4.2% |
| = | 1 | 4.2% |
| → | 1 | 4.2% |
| − | 1 | 4.2% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 12 | |
| ² | 3 | 15.8% |
| ³ | 2 | 10.5% |
| ⅓ | 1 | 5.3% |
| ⁴ | 1 | 5.3% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 3 | |
| ☆ | 2 | |
| ™ | 1 | 12.5% |
| ♡ | 1 | 12.5% |
| № | 1 | 12.5% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 18 | |
| ¢ | 2 | 9.5% |
| £ | 1 | 4.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 966 | |
| – | 15 | 1.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 82 | |
| ] | 5 | 5.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 80 | |
| [ | 5 | 5.9% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | 97.4% |
| ” | 1 | 2.6% |
Initial Punctuation
| Value | Count | Frequency (%) |
| ‘ | 1 | 50.0% |
| “ | 1 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 90830 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 | 100.0% |
Format
| Value | Count | Frequency (%) |
| | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 650908 | 85.9% |
| Common | 106439 | 14.0% |
| Cyrillic | 346 | < 0.1% |
| Greek | 170 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| Han | 5 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 76251 | 11.7% |
| a | 48943 | 7.5% |
| o | 45674 | 7.0% |
| n | 40820 | 6.3% |
| r | 40018 | 6.1% |
| i | 39767 | 6.1% |
| t | 36725 | 5.6% |
| s | 29525 | 4.5% |
| h | 28516 | 4.4% |
| l | 25924 | 4.0% |
| Other values (107) | 238745 | 36.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 90830 | 85.3% |
| : | 3717 | 3.5% |
| ' | 2505 | 2.4% |
| . | 1603 | 1.5% |
| , | 1134 | 1.1% |
| - | 966 | 0.9% |
| 2 | 861 | 0.8% |
| 1 | 697 | 0.7% |
| ! | 647 | 0.6% |
| 0 | 616 | 0.6% |
| Other values (50) | 2863 | 2.7% |
Cyrillic
| Value | Count | Frequency (%) |
| е | 32 | 9.2% |
| о | 32 | 9.2% |
| а | 29 | 8.4% |
| н | 24 | 6.9% |
| и | 23 | 6.6% |
| р | 22 | 6.4% |
| к | 17 | 4.9% |
| с | 15 | 4.3% |
| л | 14 | 4.0% |
| в | 14 | 4.0% |
| Other values (38) | 124 | 35.8% |
Greek
| Value | Count | Frequency (%) |
| α | 20 | 11.8% |
| ι | 14 | 8.2% |
| ο | 14 | 8.2% |
| τ | 9 | 5.3% |
| ά | 8 | 4.7% |
| λ | 8 | 4.7% |
| ρ | 8 | 4.7% |
| ν | 7 | 4.1% |
| ε | 6 | 3.5% |
| ς | 6 | 3.5% |
| Other values (32) | 70 | 41.2% |
Katakana
| Value | Count | Frequency (%) |
| テ | 1 | 12.5% |
| ポ | 1 | 12.5% |
| ィ | 1 | 12.5% |
| ス | 1 | 12.5% |
| タ | 1 | 12.5% |
| ン | 1 | 12.5% |
| ァ | 1 | 12.5% |
| フ | 1 | 12.5% |
Arabic
| Value | Count | Frequency (%) |
| چ | 2 | 18.2% |
| ه | 2 | 18.2% |
| ی | 2 | 18.2% |
| ک | 2 | 18.2% |
| س | 1 | 9.1% |
| ا | 1 | 9.1% |
| ج | 1 | 9.1% |
Han
| Value | Count | Frequency (%) |
| 傳 | 1 | 20.0% |
| 空 | 1 | 20.0% |
| 時 | 1 | 20.0% |
| 狗 | 1 | 20.0% |
| 貓 | 1 | 20.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 756322 | 99.8% |
| None | 1124 | 0.1% |
| Cyrillic | 346 | < 0.1% |
| Punctuation | 62 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| CJK | 5 | < 0.1% |
| Misc Symbols | 3 | < 0.1% |
| Letterlike Symbols | 2 | < 0.1% |
| Math Operators | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 90830 | 12.0% |
| e | 76251 | 10.1% |
| a | 48943 | 6.5% |
| o | 45674 | 6.0% |
| n | 40820 | 5.4% |
| r | 40018 | 5.3% |
| i | 39767 | 5.3% |
| t | 36725 | 4.9% |
| s | 29525 | 3.9% |
| h | 28516 | 3.8% |
| Other values (76) | 279253 | 36.9% |
None
| Value | Count | Frequency (%) |
| é | 218 | 19.4% |
| ä | 127 | 11.3% |
| ö | 55 | 4.9% |
| è | 53 | 4.7% |
| ô | 44 | 3.9% |
| ü | 39 | 3.5% |
| ó | 37 | 3.3% |
| á | 35 | 3.1% |
| ı | 35 | 3.1% |
| í | 33 | 2.9% |
| Other values (108) | 448 | 39.9% |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | 59.7% |
| – | 15 | 24.2% |
| … | 5 | 8.1% |
| | 2 | 3.2% |
| ‘ | 1 | 1.6% |
| ” | 1 | 1.6% |
| “ | 1 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| е | 32 | 9.2% |
| о | 32 | 9.2% |
| а | 29 | 8.4% |
| н | 24 | 6.9% |
| и | 23 | 6.6% |
| р | 22 | 6.4% |
| к | 17 | 4.9% |
| с | 15 | 4.3% |
| л | 14 | 4.0% |
| в | 14 | 4.0% |
| Other values (38) | 124 | 35.8% |
Arabic
| Value | Count | Frequency (%) |
| چ | 2 | 18.2% |
| ه | 2 | 18.2% |
| ی | 2 | 18.2% |
| ک | 2 | 18.2% |
| س | 1 | 9.1% |
| ا | 1 | 9.1% |
| ج | 1 | 9.1% |
Misc Symbols
| Value | Count | Frequency (%) |
| ☆ | 2 | 66.7% |
| ♡ | 1 | 33.3% |
CJK
| Value | Count | Frequency (%) |
| 傳 | 1 | 20.0% |
| 空 | 1 | 20.0% |
| 時 | 1 | 20.0% |
| 狗 | 1 | 20.0% |
| 貓 | 1 | 20.0% |
Number Forms
| Value | Count | Frequency (%) |
| ⅓ | 1 | 100.0% |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 1 | 50.0% |
| № | 1 | 50.0% |
Math Operators
| Value | Count | Frequency (%) |
| ∞ | 1 | 50.0% |
| − | 1 | 50.0% |
Katakana
| Value | Count | Frequency (%) |
| テ | 1 | 12.5% |
| ポ | 1 | 12.5% |
| ィ | 1 | 12.5% |
| ス | 1 | 12.5% |
| タ | 1 | 12.5% |
| ン | 1 | 12.5% |
| ァ | 1 | 12.5% |
| フ | 1 | 12.5% |
Arrows
| Value | Count | Frequency (%) |
| → | 1 | 100.0% |
vote_average
Real number (ℝ)
| Distinct | 92 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.6236982 |
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 2950 |
| Zeros (%) | 6.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 6 |
| Q3 | 6.8 |
| 95-th percentile | 7.8 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 1.915905 |
|---|---|
| Coefficient of variation (CV) | 0.34068417 |
| Kurtosis | 2.5398342 |
| Mean | 5.6236982 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | -1.5243101 |
| Sum | 255197.8 |
| Variance | 3.6706919 |
| Monotonicity | Not monotonic |
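The descriptive statistics above are linked by a few simple identities, which makes them easy to sanity-check. The following is a minimal sketch in Python that uses only figures copied from this section plus the observation count from the dataset overview; the variable names are illustrative and not part of the report.

```python
# Figures copied from the vote_average tables in this report.
n_observations = 45379   # "Number of observations" in the dataset overview
mean = 5.6236982
std = 1.915905
total = 255197.8         # "Sum"
q1, q3 = 5.0, 6.8

cv = std / mean                       # coefficient of variation
variance = std ** 2                   # variance is the squared standard deviation
iqr = q3 - q1                         # interquartile range
mean_from_sum = total / n_observations

print(f"CV={cv:.8f} variance={variance:.7f} IQR={iqr:.1f} mean={mean_from_sum:.7f}")
```

Each derived value should agree with the corresponding table entry up to the rounding used in the report.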
| Value | Count | Frequency (%) |
| 0 | 2950 | 6.5% |
| 6 | 2462 | 5.4% |
| 5 | 1998 | 4.4% |
| 7 | 1883 | 4.1% |
| 6.5 | 1722 | 3.8% |
| 6.3 | 1603 | 3.5% |
| 5.5 | 1381 | 3.0% |
| 5.8 | 1369 | 3.0% |
| 6.4 | 1350 | 3.0% |
| 6.7 | 1342 | 3.0% |
| Other values (82) | 27319 | 60.2% |
| Value | Count | Frequency (%) |
| 0 | 2950 | 6.5% |
| 0.5 | 13 | < 0.1% |
| 0.7 | 1 | < 0.1% |
| 1 | 103 | 0.2% |
| 1.1 | 1 | < 0.1% |
| 1.2 | 4 | < 0.1% |
| 1.3 | 13 | < 0.1% |
| 1.4 | 5 | < 0.1% |
| 1.5 | 30 | 0.1% |
| 1.6 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 10 | 185 | 0.4% |
| 9.8 | 1 | < 0.1% |
| 9.6 | 1 | < 0.1% |
| 9.5 | 18 | < 0.1% |
| 9.4 | 3 | < 0.1% |
| 9.3 | 18 | < 0.1% |
| 9.2 | 4 | < 0.1% |
| 9.1 | 2 | < 0.1% |
| 9 | 158 | 0.3% |
| 8.9 | 7 | < 0.1% |
vote_count
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 1820 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 110.08916 |
| Minimum | 0 |
| Maximum | 14075 |
| Zeros | 2852 |
| Zeros (%) | 6.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 10 |
| Q3 | 34 |
| 95-th percentile | 434 |
| Maximum | 14075 |
| Range | 14075 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 491.72745 |
|---|---|
| Coefficient of variation (CV) | 4.4666292 |
| Kurtosis | 150.93835 |
| Mean | 110.08916 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 10.441119 |
| Sum | 4995736 |
| Variance | 241795.89 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 3242 | 7.1% |
| 2 | 3127 | 6.9% |
| 0 | 2852 | 6.3% |
| 3 | 2785 | 6.1% |
| 4 | 2478 | 5.5% |
| 5 | 2097 | 4.6% |
| 6 | 1747 | 3.8% |
| 7 | 1570 | 3.5% |
| 8 | 1359 | 3.0% |
| 9 | 1194 | 2.6% |
| Other values (1810) | 22928 | 50.5% |
| Value | Count | Frequency (%) |
| 0 | 2852 | 6.3% |
| 1 | 3242 | 7.1% |
| 2 | 3127 | 6.9% |
| 3 | 2785 | 6.1% |
| 4 | 2478 | 5.5% |
| 5 | 2097 | 4.6% |
| 6 | 1747 | 3.8% |
| 7 | 1570 | 3.5% |
| 8 | 1359 | 3.0% |
| 9 | 1194 | 2.6% |
| Value | Count | Frequency (%) |
| 14075 | 1 | < 0.1% |
| 12269 | 1 | < 0.1% |
| 12114 | 1 | < 0.1% |
| 12000 | 1 | < 0.1% |
| 11444 | 1 | < 0.1% |
| 11187 | 1 | < 0.1% |
| 10297 | 1 | < 0.1% |
| 10014 | 1 | < 0.1% |
| 9678 | 1 | < 0.1% |
| 9634 | 1 | < 0.1% |
release_year
Real number (ℝ)
| Distinct | 135 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1991.8822 |
| Minimum | 1874 |
| Maximum | 2020 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 1874 |
|---|---|
| 5-th percentile | 1941 |
| Q1 | 1978 |
| median | 2001 |
| Q3 | 2010 |
| 95-th percentile | 2015 |
| Maximum | 2020 |
| Range | 146 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 24.054986 |
|---|---|
| Coefficient of variation (CV) | 0.01207651 |
| Kurtosis | 0.84032964 |
| Mean | 1991.8822 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -1.2249397 |
| Sum | 90389624 |
| Variance | 578.64235 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2014 | 1975 | 4.4% |
| 2015 | 1905 | 4.2% |
| 2013 | 1889 | 4.2% |
| 2012 | 1723 | 3.8% |
| 2011 | 1667 | 3.7% |
| 2016 | 1604 | 3.5% |
| 2009 | 1586 | 3.5% |
| 2010 | 1501 | 3.3% |
| 2008 | 1473 | 3.2% |
| 2007 | 1320 | 2.9% |
| Other values (125) | 28736 | 63.3% |
| Value | Count | Frequency (%) |
| 1874 | 1 | < 0.1% |
| 1878 | 1 | < 0.1% |
| 1883 | 1 | < 0.1% |
| 1887 | 1 | < 0.1% |
| 1888 | 2 | < 0.1% |
| 1890 | 5 | < 0.1% |
| 1891 | 6 | < 0.1% |
| 1892 | 3 | < 0.1% |
| 1893 | 1 | < 0.1% |
| 1894 | 13 | < 0.1% |
| Value | Count | Frequency (%) |
| 2020 | 1 | < 0.1% |
| 2018 | 5 | < 0.1% |
| 2017 | 532 | 1.2% |
| 2016 | 1604 | 3.5% |
| 2015 | 1905 | 4.2% |
| 2014 | 1975 | 4.4% |
| 2013 | 1889 | 4.2% |
| 2012 | 1723 | 3.8% |
| 2011 | 1667 | 3.7% |
| 2010 | 1501 | 3.3% |
return
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 5226 |
|---|---|
| Distinct (%) | 11.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 659.99862 |
| Minimum | 0 |
| Maximum | 12396383 |
| Zeros | 40005 |
| Zeros (%) | 88.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2.5316 |
| Maximum | 12396383 |
| Range | 12396383 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 74690.825 |
|---|---|
| Coefficient of variation (CV) | 113.16815 |
| Kurtosis | 20674.324 |
| Mean | 659.99862 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 138.3341 |
| Sum | 29950077 |
| Variance | 5.5787194 × 10⁹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 40005 | 88.2% |
| 1 | 20 | < 0.1% |
| 2 | 12 | < 0.1% |
| 4 | 10 | < 0.1% |
| 5 | 8 | < 0.1% |
| 1.333333333 | 7 | < 0.1% |
| 2.5 | 7 | < 0.1% |
| 3 | 7 | < 0.1% |
| 1.5 | 6 | < 0.1% |
| 4.666666667 | 4 | < 0.1% |
| Other values (5216) | 5293 | 11.7% |
| Value | Count | Frequency (%) |
| 0 | 40005 | 88.2% |
| 5.217391304 × 10⁻⁷ | 1 | < 0.1% |
| 7.5 × 10⁻⁷ | 1 | < 0.1% |
| 9.375 × 10⁻⁷ | 1 | < 0.1% |
| 1.499133126 × 10⁻⁶ | 1 | < 0.1% |
| 1.8 × 10⁻⁶ | 1 | < 0.1% |
| 1.916666667 × 10⁻⁶ | 1 | < 0.1% |
| 3.5 × 10⁻⁶ | 1 | < 0.1% |
| 4 × 10⁻⁶ | 1 | < 0.1% |
| 5.111111111 × 10⁻⁶ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 12396383 | 1 | < 0.1% |
| 8500000 | 1 | < 0.1% |
| 4197476.625 | 1 | < 0.1% |
| 2755584 | 1 | < 0.1% |
| 1018619.283 | 1 | < 0.1% |
| 1000000 | 1 | < 0.1% |
| 26881.72043 | 1 | < 0.1% |
| 12890.38667 | 1 | < 0.1% |
| 5330.33945 | 1 | < 0.1% |
| 4133.333333 | 1 | < 0.1% |
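The report does not say how `return` was derived. The sample rows later in the report are consistent with revenue divided by budget, with 0 recorded whenever the budget is 0. The sketch below encodes that reading; `compute_return` is a hypothetical helper written for illustration, not code taken from the dataset's pipeline.

```python
def compute_return(revenue: float, budget: float) -> float:
    """Ratio of revenue to budget; 0.0 when the budget is recorded as 0.

    Assumption: this mirrors how the `return` column was built. The report
    itself does not document the derivation.
    """
    if budget <= 0:
        return 0.0
    return revenue / budget


# (title, revenue, budget, return value shown in the report's sample rows)
sample_rows = [
    ("Toy Story", 373554033.0, 30000000.0, 12.451801),
    ("Jumanji", 262797249.0, 65000000.0, 4.043035),
    ("Father of the Bride Part II", 76578911.0, 0.0, 0.0),
    ("Sabrina", 0.0, 58000000.0, 0.0),
]

for title, revenue, budget, reported in sample_rows:
    print(title, round(compute_return(revenue, budget), 6), reported)
```

Under this reading, the 88.2% share of zeros would reflect rows with no recorded budget or revenue, and the extreme maxima would come from very small recorded budgets. Both points are inferences rather than statements made by the report.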
name_collection
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 1696 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| sin datos | 40891 |
|---|---|
| The Bowery Boys | 29 |
| Totò Collection | 27 |
| James Bond Collection | 26 |
| Zatôichi: The Blind Swordsman | 26 |
| Other values (1691) | 4380 |
Length
| Max length | 54 |
|---|---|
| Median length | 9 |
| Mean length | 10.469248 |
| Min length | 3 |
Characters and Unicode
| Total characters | 475084 |
|---|---|
| Distinct characters | 166 |
| Distinct categories | 12 |
| Distinct scripts | 7 |
| Distinct blocks | 8 |
Unique
| Unique | 390 |
|---|---|
| Unique (%) | 0.9% |
Sample
| 1st row | Toy Story Collection |
|---|---|
| 2nd row | sin datos |
| 3rd row | Grumpy Old Men Collection |
| 4th row | sin datos |
| 5th row | Father of the Bride Collection |
Common Values
| Value | Count | Frequency (%) |
| sin datos | 40891 | 90.1% |
| The Bowery Boys | 29 | 0.1% |
| Totò Collection | 27 | 0.1% |
| James Bond Collection | 26 | 0.1% |
| Zatôichi: The Blind Swordsman | 26 | 0.1% |
| The Carry On Collection | 25 | 0.1% |
| Pokémon Collection | 22 | < 0.1% |
| Charlie Chan (Sidney Toler) Collection | 21 | < 0.1% |
| Godzilla (Showa) Collection | 16 | < 0.1% |
| Dragon Ball Z (Movie) Collection | 15 | < 0.1% |
| Other values (1686) | 4281 | 9.4% |
Words
| Value | Count | Frequency (%) |
| sin | 40893 | 42.3% |
| datos | 40891 | 42.3% |
| collection | 3743 | 3.9% |
| the | 1146 | 1.2% |
| of | 230 | 0.2% |
| series | 147 | 0.2% |
| 139 | 0.1% | |
| trilogy | 87 | 0.1% |
| and | 84 | 0.1% |
| a | 62 | 0.1% |
| Other values (2408) | 9144 | 9.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 84370 | 17.8% |
| o | 52005 | 10.9% |
| (space) | 51188 | 10.8% |
| i | 48450 | 10.2% |
| n | 48294 | 10.2% |
| t | 47379 | 10.0% |
| a | 45350 | 9.5% |
| d | 42284 | 8.9% |
| e | 10450 | 2.2% |
| l | 10200 | 2.1% |
| Other values (156) | 35114 | 7.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 408231 | 85.9% |
| Space Separator | 51188 | 10.8% |
| Uppercase Letter | 13885 | 2.9% |
| Other Punctuation | 576 | 0.1% |
| Open Punctuation | 335 | 0.1% |
| Close Punctuation | 335 | 0.1% |
| Decimal Number | 321 | 0.1% |
| Dash Punctuation | 162 | < 0.1% |
| Other Letter | 37 | < 0.1% |
| Final Punctuation | 9 | < 0.1% |
| Other values (2) | 5 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 84370 | 20.7% |
| o | 52005 | 12.7% |
| i | 48450 | 11.9% |
| n | 48294 | 11.8% |
| t | 47379 | 11.6% |
| a | 45350 | 11.1% |
| d | 42284 | 10.4% |
| e | 10450 | 2.6% |
| l | 10200 | 2.5% |
| c | 4845 | 1.2% |
| Other values (69) | 14604 | 3.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 4474 | 32.2% |
| T | 1527 | 11.0% |
| S | 1063 | 7.7% |
| B | 682 | 4.9% |
| M | 630 | 4.5% |
| A | 509 | 3.7% |
| D | 505 | 3.6% |
| H | 462 | 3.3% |
| P | 432 | 3.1% |
| G | 417 | 3.0% |
| Other values (33) | 3184 | 22.9% |
Other Letter
| Value | Count | Frequency (%) |
| つ | 3 | 8.1% |
| は | 3 | 8.1% |
| よ | 3 | 8.1% |
| シ | 3 | 8.1% |
| リ | 3 | 8.1% |
| ら | 3 | 8.1% |
| い | 3 | 8.1% |
| ズ | 3 | 8.1% |
| 男 | 3 | 8.1% |
| 식 | 2 | 5.4% |
| Other values (4) | 8 | 21.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 172 | 29.9% |
| ' | 107 | 18.6% |
| : | 99 | 17.2% |
| , | 79 | 13.7% |
| & | 52 | 9.0% |
| ! | 35 | 6.1% |
| / | 21 | 3.6% |
| ? | 4 | 0.7% |
| * | 4 | 0.7% |
| … | 3 | 0.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 80 | 24.9% |
| 9 | 64 | 19.9% |
| 3 | 54 | 16.8% |
| 0 | 51 | 15.9% |
| 2 | 21 | 6.5% |
| 8 | 13 | 4.0% |
| 5 | 12 | 3.7% |
| 7 | 11 | 3.4% |
| 6 | 10 | 3.1% |
| 4 | 5 | 1.6% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 330 | 98.5% |
| [ | 5 | 1.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 330 | 98.5% |
| ] | 5 | 1.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 160 | 98.8% |
| – | 2 | 1.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 51188 | 100.0% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 | 100.0% |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 3 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 421702 | 88.8% |
| Common | 52931 | 11.1% |
| Cyrillic | 414 | 0.1% |
| Hiragana | 15 | < 0.1% |
| Hangul | 10 | < 0.1% |
| Katakana | 9 | < 0.1% |
| Han | 3 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 84370 | 20.0% |
| o | 52005 | 12.3% |
| i | 48450 | 11.5% |
| n | 48294 | 11.5% |
| t | 47379 | 11.2% |
| a | 45350 | 10.8% |
| d | 42284 | 10.0% |
| e | 10450 | 2.5% |
| l | 10200 | 2.4% |
| c | 4845 | 1.1% |
| Other values (70) | 28075 | 6.7% |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| ц | 16 | 3.9% |
| К | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 | 34.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 51188 | 96.7% |
| ( | 330 | 0.6% |
| ) | 330 | 0.6% |
| . | 172 | 0.3% |
| - | 160 | 0.3% |
| ' | 107 | 0.2% |
| : | 99 | 0.2% |
| 1 | 80 | 0.2% |
| , | 79 | 0.1% |
| 9 | 64 | 0.1% |
| Other values (20) | 322 | 0.6% |
Hiragana
| Value | Count | Frequency (%) |
| つ | 3 | 20.0% |
| は | 3 | 20.0% |
| よ | 3 | 20.0% |
| ら | 3 | 20.0% |
| い | 3 | 20.0% |
Hangul
| Value | Count | Frequency (%) |
| 식 | 2 | 20.0% |
| 객 | 2 | 20.0% |
| 시 | 2 | 20.0% |
| 리 | 2 | 20.0% |
| 즈 | 2 | 20.0% |
Katakana
| Value | Count | Frequency (%) |
| シ | 3 | 33.3% |
| リ | 3 | 33.3% |
| ズ | 3 | 33.3% |
Han
| Value | Count | Frequency (%) |
| 男 | 3 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 474370 | 99.8% |
| Cyrillic | 414 | 0.1% |
| None | 246 | 0.1% |
| Hiragana | 15 | < 0.1% |
| Punctuation | 14 | < 0.1% |
| Katakana | 12 | < 0.1% |
| Hangul | 10 | < 0.1% |
| CJK | 3 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 84370 | 17.8% |
| o | 52005 | 11.0% |
| (space) | 51188 | 10.8% |
| i | 48450 | 10.2% |
| n | 48294 | 10.2% |
| t | 47379 | 10.0% |
| a | 45350 | 9.6% |
| d | 42284 | 8.9% |
| e | 10450 | 2.2% |
| l | 10200 | 2.2% |
| Other values (67) | 34400 | 7.3% |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| ц | 16 | 3.9% |
| К | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 | 34.5% |
None
| Value | Count | Frequency (%) |
| é | 45 | 18.3% |
| ä | 40 | 16.3% |
| ô | 35 | 14.2% |
| ò | 28 | 11.4% |
| ö | 19 | 7.7% |
| ó | 14 | 5.7% |
| ı | 14 | 5.7% |
| í | 9 | 3.7% |
| á | 4 | 1.6% |
| İ | 4 | 1.6% |
| Other values (19) | 34 | 13.8% |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 | 64.3% |
| … | 3 | 21.4% |
| – | 2 | 14.3% |
Hiragana
| Value | Count | Frequency (%) |
| つ | 3 | 20.0% |
| は | 3 | 20.0% |
| よ | 3 | 20.0% |
| ら | 3 | 20.0% |
| い | 3 | 20.0% |
Katakana
| Value | Count | Frequency (%) |
| シ | 3 | 25.0% |
| リ | 3 | 25.0% |
| ー | 3 | 25.0% |
| ズ | 3 | 25.0% |
CJK
| Value | Count | Frequency (%) |
| 男 | 3 | 100.0% |
Hangul
| Value | Count | Frequency (%) |
| 식 | 2 | 20.0% |
| 객 | 2 | 20.0% |
| 시 | 2 | 20.0% |
| 리 | 2 | 20.0% |
| 즈 | 2 | 20.0% |
| | budget | id | revenue | runtime | vote_average | vote_count | release_year | return | original_language | status |
|---|---|---|---|---|---|---|---|---|---|---|
| budget | 1.000 | -0.255 | 0.644 | 0.229 | 0.072 | 0.484 | 0.141 | 0.145 | 0.000 | 0.000 |
| id | -0.255 | 1.000 | -0.277 | -0.213 | -0.149 | -0.433 | 0.392 | -0.202 | 0.071 | 0.053 |
| revenue | 0.644 | -0.277 | 1.000 | 0.255 | 0.127 | 0.513 | 0.103 | 0.166 | 0.000 | 0.000 |
| runtime | 0.229 | -0.213 | 0.255 | 1.000 | 0.196 | 0.298 | 0.032 | 0.094 | 0.111 | 0.000 |
| vote_average | 0.072 | -0.149 | 0.127 | 0.196 | 1.000 | 0.318 | -0.009 | 0.063 | 0.070 | 0.026 |
| vote_count | 0.484 | -0.433 | 0.513 | 0.298 | 0.318 | 1.000 | 0.197 | 0.150 | 0.000 | 0.000 |
| release_year | 0.141 | 0.392 | 0.103 | 0.032 | -0.009 | 0.197 | 1.000 | -0.083 | 0.144 | 0.027 |
| return | 0.145 | -0.202 | 0.166 | 0.094 | 0.063 | 0.150 | -0.083 | 1.000 | 0.000 | 0.000 |
| original_language | 0.000 | 0.071 | 0.000 | 0.111 | 0.070 | 0.000 | 0.144 | 0.000 | 1.000 | 0.072 |
| status | 0.000 | 0.053 | 0.000 | 0.000 | 0.026 | 0.000 | 0.027 | 0.000 | 0.072 | 1.000 |
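The high-correlation alerts in the report overview (budget with revenue, revenue with vote_count) correspond to the only off-diagonal cells in the matrix above whose absolute value reaches 0.5. The sketch below scans the matrix for such pairs. The 0.5 cut-off is an assumption that happens to reproduce the alerts, since the report does not state its threshold, and `high_correlation_pairs` is an illustrative helper name.

```python
# Correlation values copied from the matrix above (row/column order preserved).
names = ["budget", "id", "revenue", "runtime", "vote_average", "vote_count",
         "release_year", "return", "original_language", "status"]
matrix = [
    [1.000, -0.255, 0.644, 0.229, 0.072, 0.484, 0.141, 0.145, 0.000, 0.000],
    [-0.255, 1.000, -0.277, -0.213, -0.149, -0.433, 0.392, -0.202, 0.071, 0.053],
    [0.644, -0.277, 1.000, 0.255, 0.127, 0.513, 0.103, 0.166, 0.000, 0.000],
    [0.229, -0.213, 0.255, 1.000, 0.196, 0.298, 0.032, 0.094, 0.111, 0.000],
    [0.072, -0.149, 0.127, 0.196, 1.000, 0.318, -0.009, 0.063, 0.070, 0.026],
    [0.484, -0.433, 0.513, 0.298, 0.318, 1.000, 0.197, 0.150, 0.000, 0.000],
    [0.141, 0.392, 0.103, 0.032, -0.009, 0.197, 1.000, -0.083, 0.144, 0.027],
    [0.145, -0.202, 0.166, 0.094, 0.063, 0.150, -0.083, 1.000, 0.000, 0.000],
    [0.000, 0.071, 0.000, 0.111, 0.070, 0.000, 0.144, 0.000, 1.000, 0.072],
    [0.000, 0.053, 0.000, 0.000, 0.026, 0.000, 0.027, 0.000, 0.072, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; not stated in the report


def high_correlation_pairs(names, matrix, threshold):
    """List (name_a, name_b, value) for upper-triangle cells with |value| >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


flagged = high_correlation_pairs(names, matrix, THRESHOLD)
print(flagged)
```

Only the upper triangle is scanned because the matrix is symmetric, so each pair is reported once.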
| | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | release_year | return | name_collection |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | {'id': 10194, 'name': 'Toy Story Collection', 'poster_path': '/7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg', 'backdrop_path': '/9FBwqcd9IRruEDUrTdcaafOMKUq.jpg'} | 30000000.0 | Animation, Comedy, Family | 862 | en | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. | 21.946943 | Pixar Animation Studios | United States of America | 1995-10-30 | 373554033.0 | 81.0 | English | Released | sin datos | Toy Story | 7.7 | 5415.0 | 1995 | 12.451801 | Toy Story Collection |
| 1 | {'name':'sin datos' } | 65000000.0 | Adventure, Fantasy, Family | 8844 | en | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. | 17.015539 | TriStar Pictures, Teitler Film, Interscope Communications | United States of America | 1995-12-15 | 262797249.0 | 104.0 | English, Français | Released | Roll the dice and unleash the excitement! | Jumanji | 6.9 | 2413.0 | 1995 | 4.043035 | sin datos |
| 2 | {'id': 119050, 'name': 'Grumpy Old Men Collection', 'poster_path': '/nLvUdqgPgm3F85NMCii9gVFUcet.jpg', 'backdrop_path': '/hypTnLot2z8wpFS7qwsQHW1uV8u.jpg'} | 0.0 | Romance, Comedy | 15602 | en | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. | 11.7129 | Warner Bros., Lancaster Gate | United States of America | 1995-12-22 | 0.0 | 101.0 | English | Released | Still Yelling. Still Fighting. Still Ready for Love. | Grumpier Old Men | 6.5 | 92.0 | 1995 | 0.000000 | Grumpy Old Men Collection |
| 3 | {'name':'sin datos' } | 16000000.0 | Comedy, Drama, Romance | 31357 | en | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. | 3.859495 | Twentieth Century Fox Film Corporation | United States of America | 1995-12-22 | 81452156.0 | 127.0 | English | Released | Friends are the people who let you be yourself... and never let you forget it. | Waiting to Exhale | 6.1 | 34.0 | 1995 | 5.090760 | sin datos |
| 4 | {'id': 96871, 'name': 'Father of the Bride Collection', 'poster_path': '/nts4iOmNnq7GNicycMJ9pSAn204.jpg', 'backdrop_path': '/7qwE57OVZmMJChBpLEbJEmzUydk.jpg'} | 0.0 | Comedy | 11862 | en | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. | 8.387519 | Sandollar Productions, Touchstone Pictures | United States of America | 1995-02-10 | 76578911.0 | 106.0 | English | Released | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! | Father of the Bride Part II | 5.7 | 173.0 | 1995 | 0.000000 | Father of the Bride Collection |
| 5 | {'name':'sin datos' } | 60000000.0 | Action, Crime, Drama, Thriller | 949 | en | Obsessive master thief, Neil McCauley leads a top-notch crew on various insane heists throughout Los Angeles while a mentally unstable detective, Vincent Hanna pursues him without rest. Each man recognizes and respects the ability and the dedication of the other even though they are aware their cat-and-mouse game may end in violence. | 17.924927 | Regency Enterprises, Forward Pass, Warner Bros. | United States of America | 1995-12-15 | 187436818.0 | 170.0 | English, Español | Released | A Los Angeles Crime Saga | Heat | 7.7 | 1886.0 | 1995 | 3.123947 | sin datos |
| 6 | {'name':'sin datos' } | 58000000.0 | Comedy, Romance | 11860 | en | An ugly duckling having undergone a remarkable change, still harbors feelings for her crush: a carefree playboy, but not before his business-focused brother has something to say about it. | 6.677277 | Paramount Pictures, Scott Rudin Productions, Mirage Enterprises, Sandollar Productions, Constellation Entertainment, Worldwide, Mont Blanc Entertainment GmbH | Germany, United States of America | 1995-12-15 | 0.0 | 127.0 | Français, English | Released | You are cordially invited to the most surprising merger of the year. | Sabrina | 6.2 | 141.0 | 1995 | 0.000000 | sin datos |
| 7 | {'name':'sin datos' } | 0.0 | Action, Adventure, Drama, Family | 45325 | en | A mischievous young boy, Tom Sawyer, witnesses a murder by the deadly Injun Joe. Tom becomes friends with Huckleberry Finn, a boy with no future and no family. Tom has to choose between honoring a friendship or honoring an oath because the town alcoholic is accused of the murder. Tom and Huck go through several adventures trying to retrieve evidence. | 2.561161 | Walt Disney Pictures | United States of America | 1995-12-22 | 0.0 | 97.0 | English, Deutsch | Released | The Original Bad Boys. | Tom and Huck | 5.4 | 45.0 | 1995 | 0.000000 | sin datos |
| 8 | {'name':'sin datos' } | 35000000.0 | Action, Adventure, Thriller | 9091 | en | International action superstar Jean Claude Van Damme teams with Powers Boothe in a Tension-packed, suspense thriller, set against the back-drop of a Stanley Cup game.Van Damme portrays a father whose daughter is suddenly taken during a championship hockey game. With the captors demanding a billion dollars by game's end, Van Damme frantically sets a plan in motion to rescue his daughter and abort an impending explosion before the final buzzer... | 5.23158 | Universal Pictures, Imperial Entertainment, Signature Entertainment | United States of America | 1995-12-22 | 64350171.0 | 106.0 | English | Released | Terror goes into overtime. | Sudden Death | 5.5 | 174.0 | 1995 | 1.838576 | sin datos |
| 9 | {'id': 645, 'name': 'James Bond Collection', 'poster_path': '/HORpg5CSkmeQlAolx3bKMrKgfi.jpg', 'backdrop_path': '/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg'} | 58000000.0 | Adventure, Action, Thriller | 710 | en | James Bond must unmask the mysterious head of the Janus Syndicate and prevent the leader from utilizing the GoldenEye weapons system to inflict devastating revenge on Britain. | 14.686036 | United Artists, Eon Productions | United Kingdom, United States of America | 1995-11-16 | 352194034.0 | 130.0 | English, Pусский, Español | Released | No limits. No fears. No substitutes. | GoldenEye | 6.6 | 1194.0 | 1995 | 6.072311 | James Bond Collection |
| | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | release_year | return | name_collection |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 45455 | {'name':'sin datos' } | 0.0 | sin datos | 67179 | it | Sentenced to life imprisonment for illegal activities, Italian International member Giulio Manieri holds on to his political ideals while struggling against madness in the loneliness of his prison cell. | 0.225051 | sin datos | sin datos | 1972-01-01 | 0.0 | 90.0 | Italiano | Released | sin datos | St. Michael Had a Rooster | 6.0 | 3.0 | 1972 | 0.0 | sin datos |
| 45456 | {'name':'sin datos' } | 0.0 | Horror, Mystery, Thriller | 84419 | en | An unsuccessful sculptor saves a madman named "The Creeper" from drowning. Seeing an opportunity for revenge, he tricks the psycho into murdering his critics. | 0.222814 | Universal Pictures | United States of America | 1946-03-29 | 0.0 | 65.0 | English | Released | Meet...The CREEPER! | House of Horrors | 6.3 | 8.0 | 1946 | 0.0 | sin datos |
| 45457 | {'name':'sin datos' } | 0.0 | Mystery, Horror | 390959 | en | In this true-crime documentary, we delve into the murder spree that was the inspiration for Joe Berlinger's "Book of Shadows: Blair Witch 2". | 0.076061 | sin datos | sin datos | 2000-10-22 | 0.0 | 45.0 | English | Released | sin datos | Shadow of the Blair Witch | 7.0 | 2.0 | 2000 | 0.0 | sin datos |
| 45458 | {'name':'sin datos' } | 0.0 | Horror | 289923 | en | A film archivist revisits the story of Rustin Parr, a hermit thought to have murdered seven children while under the possession of the Blair Witch. | 0.38645 | Neptune Salad Entertainment, Pirie Productions | United States of America | 2000-10-03 | 0.0 | 30.0 | English | Released | Do you know what happened 50 years before "The Blair Witch Project"? | The Burkittsville 7 | 7.0 | 1.0 | 2000 | 0.0 | sin datos |
| 45459 | {'name':'sin datos' } | 0.0 | Science Fiction | 222848 | en | It's the year 3000 AD. The world's most dangerous women are banished to a remote asteroid 45 million light years from earth. Kira Murphy doesn't belong; wrongfully accused of a crime she did not commit, she's thrown in this interplanetary prison and left to her own defenses. But Kira's a fighter, and soon she finds herself in the middle of a female gang war; where everyone wants a piece of the action... and a piece of her! "Caged Heat 3000" takes the Women-in-Prison genre to a whole new level... and a whole new galaxy! | 0.661558 | Concorde-New Horizons | United States of America | 1995-01-01 | 0.0 | 85.0 | English | Released | sin datos | Caged Heat 3000 | 3.5 | 1.0 | 1995 | 0.0 | sin datos |
| 45460 | {'name':'sin datos' } | 0.0 | Drama, Action, Romance | 30840 | en | Yet another version of the classic epic, with enough variation to make it interesting. The story is the same, but some of the characters are quite different from the usual, in particular Uma Thurman's very special maid Marian. The photography is also great, giving the story a somewhat darker tone. | 5.683753 | Westdeutscher Rundfunk (WDR), Working Title Films, 20th Century Fox Television, CanWest Global Communications | Canada, Germany, United Kingdom, United States of America | 1991-05-13 | 0.0 | 104.0 | English | Released | sin datos | Robin Hood | 5.7 | 26.0 | 1991 | 0.0 | sin datos |
| 45462 | {'name':'sin datos' } | 0.0 | Drama | 111109 | tl | An artist struggles to finish his work while a storyline about a cult plays in his head. | 0.178241 | Sine Olivia | Philippines | 2011-11-17 | 0.0 | 360.0 | sin datos | Released | sin datos | Century of Birthing | 9.0 | 3.0 | 2011 | 0.0 | sin datos |
| 45463 | {'name':'sin datos' } | 0.0 | Action, Drama, Thriller | 67758 | en | When one of her hits goes wrong, a professional assassin ends up with a suitcase full of a million dollars belonging to a mob boss ... | 0.903007 | American World Pictures | United States of America | 2003-08-01 | 0.0 | 90.0 | English | Released | A deadly game of wits. | Betrayal | 3.8 | 6.0 | 2003 | 0.0 | sin datos |
| 45464 | {'name':'sin datos' } | 0.0 | sin datos | 227506 | en | In a small town live two brothers, one a minister and the other one a hunchback painter of the chapel who lives with his wife. One dreadful and stormy night, a stranger knocks at the door asking for shelter. The stranger talks about all the good things of the earthly life the minister is missing because of his puritanical faith. The minister comes to accept the stranger's viewpoint but it is others who will pay the consequences because the minister will discover the human pleasures thanks to, ehem, his sister- in -law… The tormented minister and his cuckolded brother will die in a strange accident in the chapel and later an infant will be born from the minister's adulterous relationship. | 0.003503 | Yermoliev | Russia | 1917-10-21 | 0.0 | 87.0 | sin datos | Released | sin datos | Satan Triumphant | 0.0 | 0.0 | 1917 | 0.0 | sin datos |
| 45465 | {'name':'sin datos' } | 0.0 | sin datos | 461257 | en | 50 years after decriminalisation of homosexuality in the UK, director Daisy Asquith mines the jewels of the BFI archive to take us into the relationships, desires, fears and expressions of gay men and women in the 20th century. | 0.163015 | sin datos | United Kingdom | 2017-06-09 | 0.0 | 75.0 | English | Released | sin datos | Queerama | 0.0 | 0.0 | 2017 | 0.0 | sin datos |
Most frequently occurring
| | belongs_to_collection | budget | genres | id | original_language | overview | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | release_year | return | name_collection | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | {'id': 158365, 'name': 'Why We Fight', 'poster_path': '/fFYBLu2Hnx27CWLOMd425ExDkgK.jpg', 'backdrop_path': None} | 0.0 | Documentary | 159849 | en | The third film of Frank Capra's 'Why We Fight" propaganda film series, dealing with the Nazi conquest of Western Europe in 1940. | 1943-01-01 | 0.0 | 57.0 | English | Released | sin datos | Why We Fight: Divide and Conquer | 5.0 | 1.0 | 1943 | 0.0 | Why We Fight | 2 |
| 1 | {'id': 34055, 'name': 'Pokémon Collection', 'poster_path': '/j5te0YNZAMXDBnsqTUDKIBEt8iu.jpg', 'backdrop_path': '/iGoYKA0TFfgSoZpG2u5viTJMGfK.jpg'} | 0.0 | Adventure, Fantasy, Animation, Science Fiction, Family | 12600 | ja | All your favorite Pokémon characters are back, and are joined for the first time by the legendary Pokémon Celebi and Suicune, in this latest exciting Pokémon adventure! In order to escape a greedy Pokémon hunter, Celebi must use the last of its energy to travel through time to the present day. Celebi brings along Sammy, a boy who had been trying to protect it. Along with Ash, Pikachu, and the rest of the gang, Sammy and Celebi encounter an enemy far more advanced than the hunter left behind in the past. This new enemy possesses a Pokéball called a “Dark Ball,” which transforms the Pokémon it captures into evil and far stronger creatures. When Celebi is captured, the fate of the entire forest is threatened. Let POKÉMON 4EVER transport you to a world of adventure as Ash, Suicune and the rest take action to save the day! | 2001-07-06 | 28023563.0 | 75.0 | 日本語 | Released | sin datos | Pokémon 4Ever: Celebi - Voice of the Forest | 5.7 | 82.0 | 2001 | 0.0 | Pokémon Collection | 2 |
| 2 | {'name':'sin datos' } | 0.0 | Action, Drama, Romance, Adventure | 99080 | en | Originally called White Thunder, American producer Varick Frissell's 1931 film was inspired by his love for the Canadian Arctic Circle. Set in a beautifully black-and-white filmed Newfoundland, it is the story of a rivalry between two seal hunters that plays out on the ice floes during a hunt. Unsatisfied with the first cut, Frissell arranged for the crew to accompany an actual Newfoundland seal hunt on The SS Viking, on which an explosion of dynamite (carried regularly at the time on Arctic ships to combat ice jams) killed many members of the crew, including Frissell. The film was renamed in honor of the dead. | 1931-06-21 | 0.0 | 70.0 | English | Released | Actually produced during the Great Newfoundland Seal Hunt and You see the REAL thing | The Viking | 0.0 | 0.0 | 1931 | 0.0 | sin datos | 2 |
| 3 | {'name':'sin datos' } | 0.0 | Action, Horror, Science Fiction | 18440 | en | When a comet strikes Earth and kicks up a cloud of toxic dust, hundreds of humans join the ranks of the living dead. But there's bad news for the survivors: The newly minted zombies are hell-bent on eradicating every last person from the planet. For the few human beings who remain, going head to head with the flesh-eating fiends is their only chance for long-term survival. Yet their battle will be dark and cold, with overwhelming odds. | 2007-01-01 | 0.0 | 89.0 | English | Released | sin datos | Days of Darkness | 5.0 | 5.0 | 2007 | 0.0 | sin datos | 2 |
| 4 | {'name':'sin datos' } | 0.0 | Adventure, Animation, Drama, Action, Foreign | 23305 | en | In feudal India, a warrior (Khan) who renounces his role as the longtime enforcer to a local lord becomes the prey in a murderous hunt through the Himalayan mountains. | 2001-09-23 | 0.0 | 86.0 | हिन्दी | Released | sin datos | The Warrior | 6.3 | 15.0 | 2001 | 0.0 | sin datos | 2 |
| 5 | {'name':'sin datos' } | 0.0 | Comedy, Drama | 265189 | sv | While holidaying in the French Alps, a Swedish family deals with acts of cowardliness as an avalanche breaks out. | 2014-08-15 | 1359497.0 | 118.0 | Français, Norsk, svenska, English | Released | sin datos | Force Majeure | 6.8 | 255.0 | 2014 | 0.0 | sin datos | 2 |
| 6 | {'name':'sin datos' } | 0.0 | Crime, Drama, Thriller | 5511 | fr | Hitman Jef Costello is a perfectionist who always carefully plans his murders and who never gets caught. | 1967-10-25 | 39481.0 | 105.0 | Français | Released | There is no solitude greater than that of the Samurai | Le Samouraï | 7.9 | 187.0 | 1967 | 0.0 | sin datos | 2 |
| 7 | {'name':'sin datos' } | 0.0 | Documentary | 84198 | en | Using personal stories, this powerful documentary illuminates the plight of the 49 million Americans struggling with food insecurity. A single mother, a small-town policeman and a farmer are among those for whom putting food on the table is a daily battle. | 2012-03-22 | 0.0 | 84.0 | English | Released | One Nation. Underfed. | A Place at the Table | 6.9 | 7.0 | 2012 | 0.0 | sin datos | 2 |
| 8 | {'name':'sin datos' } | 0.0 | Drama | 109962 | en | Two literary women compete for 20 years: one writes for the critics; the other one, to get rich. | 1981-09-23 | 0.0 | 115.0 | English | Released | From the very beginning, they knew they'd be friends to the end. What they didn't count on was everything in between. | Rich and Famous | 4.9 | 7.0 | 1981 | 0.0 | sin datos | 2 |
| 9 | {'name':'sin datos' } | 0.0 | Drama, Comedy | 168538 | en | In Zola's Paris, an ingenue arrives at a tony bordello: she's Nana, guileless, but quickly learning to use her erotic innocence to get what she wants. She's an actress for a soft-core filmmaker and soon is the most popular courtesan in Paris, parlaying this into a house, bought for her by a wealthy banker. She tosses him and takes up with her neighbor, a count of impeccable rectitude, and with the count's impressionable son. The count is soon fetching sticks like a dog and mortgaging his lands to satisfy her whims. | 1983-06-13 | 0.0 | 92.0 | sin datos | Released | sin datos | Nana, the True Key of Pleasure | 4.7 | 3.0 | 1983 | 0.0 | sin datos | 2 |
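
A duplicate summary like the table above can be approximated directly in pandas. The sketch below is only an assumption about how such a table might be computed, not the profiling tool's own implementation; the function name `most_frequent_duplicates` and the toy `movies` frame are hypothetical and stand in for the real dataset.

```python
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """Return fully duplicated rows with a '# duplicates' count, most frequent first."""
    # Cells holding dicts or lists are unhashable, so compare string forms instead.
    # Note: every column in the result is therefore returned as a string.
    keys = df.astype(str)
    # Keep every copy of each row that appears more than once.
    dup = keys[keys.duplicated(keep=False)]
    # Count how many times each distinct duplicated row occurs.
    counts = (
        dup.groupby(list(dup.columns), sort=False)
        .size()
        .reset_index(name="# duplicates")
    )
    return (
        counts.sort_values("# duplicates", ascending=False)
        .head(top)
        .reset_index(drop=True)
    )


# Hypothetical toy data, not taken from the report.
movies = pd.DataFrame(
    {
        "title": ["The Viking", "The Viking", "Le Samouraï", "Queerama"],
        "release_year": [1931, 1931, 1967, 2017],
    }
)
result = most_frequent_duplicates(movies)
print(result)
```

Once the duplicates have been reviewed, `df.drop_duplicates()` would remove the repeated copies, provided the frame contains only hashable values.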